Regex to remove whatever comes in front of "\" using PowerShell

I need some help: I want a regex that removes a "\" and whatever comes before it.
The input is "vmvalidate\administrator"
and the output should be just "administrator".

$result = $subject -creplace '^[^\\]*\\', ''
This removes any non-backslash characters at the start of the string, followed by a backslash.
Explanation:
^ # Start of string
[^\\]* # Match zero or more non-backslash characters
\\ # Match a backslash
This means that if there is more than one backslash in the string, only the first one (and the text leading up to it) will be removed. If you want to remove everything until the last backslash, use
$result = $subject -creplace '(?s)^.*\\', ''
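The difference between the two patterns can be checked with a small sketch. It is written in Python rather than PowerShell, purely for illustration; the function names are hypothetical, and the assumption is that Python's re module handles these two particular patterns the same way the .NET engine does.

```python
import re

def strip_to_first_backslash(s: str) -> str:
    # ^[^\\]*\\ : non-backslash characters from the start, then one backslash
    return re.sub(r'^[^\\]*\\', '', s)

def strip_to_last_backslash(s: str) -> str:
    # (?s)^.*\\ : the greedy .* backtracks only as far as the last backslash
    return re.sub(r'(?s)^.*\\', '', s)

print(strip_to_first_backslash(r'vmvalidate\administrator'))       # administrator
print(strip_to_first_backslash(r'corp\vmvalidate\administrator'))  # vmvalidate\administrator
print(strip_to_last_backslash(r'corp\vmvalidate\administrator'))   # administrator
```

With a single backslash in the input, both functions give the same result; they only differ when the string contains more than one.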

No need to use regex, try the split method:
$string.Split('\')[-1]

"vmvalidate\administrator" -replace "^.*?\\"
^ - from the beginning of the string
.* - any number of any characters
? - makes the quantifier lazy
\\ - a backslash, escaped with the escape character "\"
All together it means "Replace all characters from the beginning of the string up to the first backslash"

This is the way I used to do things before I learned about regex or splitting.
"vmvalidate\administrator".SubString("vmvalidate\administrator".IndexOf('\')+1)
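For comparison, here is a rough Python equivalent of the two non-regex approaches (split, and index plus substring). The function names are made up for this sketch. Note that they are not interchangeable: splitting and taking the last piece keeps what follows the last backslash, while the index-based version keeps what follows the first one.

```python
def after_last_backslash(s: str) -> str:
    # Same idea as $string.Split('\')[-1]: take the final piece
    return s.split('\\')[-1]

def after_first_backslash(s: str) -> str:
    # Same idea as SubString(IndexOf('\') + 1); find() returns -1 when
    # there is no backslash, so the slice then starts at 0 (whole string)
    return s[s.find('\\') + 1:]

print(after_last_backslash(r'vmvalidate\administrator'))   # administrator
print(after_first_backslash(r'vmvalidate\administrator'))  # administrator
print(after_last_backslash(r'a\b\c'))                      # c
print(after_first_backslash(r'a\b\c'))                     # b\c
```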

Related

matching two chars with multiple lines in between

I am new to regex and I am using Perl.
I have below tag:
<CFSC>cfsc_service=TRUE
SEC=1
licenses=10
expires=20170511
</CFSC>
I want to match anything between <CFSC> and </CFSC> tags.
I tried /<CFSC>.*?\n.*?\n.*?\n.*?\n<\/CFSC>/
and /<CFSC>(.*)<\/CFSC>/ but had no luck.
You need the /s "single line" modifier to make the regex engine include line breaks in what . matches.
Treat string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.
See this example.
my $foo = qq{<CFSC>cfsc_service=TRUE
SEC=1
licenses=10
expires=20170511
</CFSC>};
$foo =~ m{>(.*)</CFSC>}s;
print $1;
You also need to use a different delimiter than /, or escape it.
Try
/<CFSC>(.*)<\/CFSC>/s
The final s makes the . match newline chars (\n = 0x0a), which it usually doesn't match:
Treat string as single line. That is, change "." to match any
character whatsoever, even a newline, which normally it would not
match.
from http://perldoc.perl.org/perlre.html#Modifiers
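The effect of the modifier can be seen in a short sketch. This one is in Python, not Perl, on the assumption that re.DOTALL plays the same role as Perl's /s; the variable names are illustrative only.

```python
import re

text = "<CFSC>cfsc_service=TRUE\nSEC=1\nlicenses=10\nexpires=20170511\n</CFSC>"

# Without the flag, . stops at the first newline, so the closing tag is never reached
without_flag = re.search(r'<CFSC>(.*?)</CFSC>', text)

# With DOTALL (the counterpart of /s), . also matches newlines
with_flag = re.search(r'<CFSC>(.*?)</CFSC>', text, re.DOTALL)

print(without_flag)        # None
print(with_flag.group(1))  # the four lines between the tags
```

The lazy .*? is used here so that, if the input held several blocks, each match would stop at the nearest closing tag.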
Try this:
$foo =~ m/<CFSC>((?:(?!<\/CFSC>).)*)<\/CFSC>/gs;
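Modifiers:
g - global match (find all occurrences)
s - lets . match newlines
i - case-insensitive matching (not used in the pattern above)
\ - escape character (not a modifier; here it escapes the / inside the pattern)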

perl regex - quantifier * not greedy enough to pickup the newline at end of string

Isn't the quantifier * greedy? Shouldn't \s* match zero or more occurrences of whitespace, and so match everything up to the end of the given input string?
#!/usr/bin/perl
use strict;
use warnings;
my $input="Name : www.devserver.com\n";
$input=~s/\w+.:\s*//; # shouldn't \s* match everything up to the \n at the end?
print $input;
Please help me understand this behaviour.
\s* matches only a consecutive run of whitespace characters; it stops at the first character that is not whitespace.
In your case, there is www.devserver.com between the leading spaces and the trailing newline.
You could try using . instead of \s:
$input=~s/\w+.:.*//;
This also wouldn't touch the trailing newline! According to perlre:
To simplify multi-line substitutions, the "." character never matches a newline unless you use the /s modifier, which in effect tells Perl to pretend the string is a single line--even if it isn't.
So, wrapping it up: the behavior you are expecting can be reproduced with the following substitution:
$input=~s/\w+.:.*//s;
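The three substitutions discussed in this answer can be compared side by side. The sketch below is a Python port (not the original Perl), assuming re.DOTALL corresponds to /s and using count=1 to mimic a Perl s/// without /g; the variable names are illustrative.

```python
import re

line = "Name : www.devserver.com\n"

# \s* only eats the spaces after the colon, then stops at 'w'
only_whitespace = re.sub(r'\w+.:\s*', '', line, count=1)

# .* runs to the end of the line but leaves the newline alone
dot_without_flag = re.sub(r'\w+.:.*', '', line, count=1)

# with DOTALL, . matches the newline too, so nothing is left
dot_with_flag = re.sub(r'\w+.:.*', '', line, count=1, flags=re.DOTALL)

print(repr(only_whitespace))   # 'www.devserver.com\n'
print(repr(dot_without_flag))  # '\n'
print(repr(dot_with_flag))     # ''
```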

match parentheses in powershell using regex

I'm trying to check for invalid filenames. I want a filename to only contain lowercase, uppercase, numbers, spaces, periods, underscores, dashes and parentheses. I've tried this regex:
$regex = [regex]"^([a-zA-Z0-9\s\._-\)\(]+)$"
$text = "hel()lo"
if($text -notmatch $regex)
{
write-host 'not valid'
}
I get this error:
Error: "parsing "^([a-zA-Z0-9\s\._-\)\(]+)$" - [x-y] range in reverse order"
What am I doing wrong?
Try moving the - to the end of the character class:
^([a-zA-Z0-9\s\._\)\(-]+)$
In the middle of a character class it needs to be escaped; otherwise it defines a range.
You can replace a-zA-Z0-9 and _ with \w.
$regex = [regex]"^([\w\s\.\-\(\)]+)$"
From get-help about_Regular_Expressions:
\w
Matches any word character. Equivalent to the Unicode character categories [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \w is equivalent to [a-zA-Z_0-9].
I guess, add a backslash before the lone hyphen:
$regex = [regex]"^([a-zA-Z0-9\s\._\-\)\(]+)$"
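The same mistake can be reproduced outside PowerShell. The following is a Python sketch, assuming Python's re module rejects a reversed range the way .NET does (the exact error text differs); compiles and is_valid_name are hypothetical helper names.

```python
import re

# '_-\)' is read as a range from '_' (0x5F) down to ')' (0x29), which is reversed
BROKEN = r'^([a-zA-Z0-9\s\._-\)\(]+)$'
# With the hyphen last in the class it is taken literally
FIXED = r'^([a-zA-Z0-9\s\._\)\(-]+)$'

def compiles(pattern: str) -> bool:
    """Return True if the pattern is accepted by the regex compiler."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

def is_valid_name(name: str) -> bool:
    return re.fullmatch(FIXED, name) is not None

print(compiles(BROKEN))          # False
print(compiles(FIXED))           # True
print(is_valid_name("hel()lo"))  # True
print(is_valid_name("he*llo"))   # False
```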

Perl: substitute any non-alphanumeric character with "_"

In Perl I want to replace any character that is not [A-Z] (case-insensitive) or [0-9] with "_", but only if this non-alphanumeric character occurs between two alphanumeric characters. I do not want to touch non-alphanumerics at the beginning or end of the string.
I know enough regex to replace them, just not how to replace only the ones in the middle of the string.
s/(\p{Alnum})\P{Alnum}(\p{Alnum})/${1}_${2}/g;
Of course that would hurt your chances with "#A#B%C", because the second alphanumeric is consumed by the match and cannot start the next one, so you might use look-arounds:
s/(?<=\p{Alnum})\P{Alnum}(?=\p{Alnum})/_/g;
That way you isolate it to just the non "alnum" character.
Or you could use \K ("keep") as well and get the same thing done.
s/\p{Alnum}\K\P{Alnum}(?=\p{Alnum})/_/g;
EDIT based on input:
To not eat a newline, you could do the following:
s/\p{Alnum}\K[^\p{Alnum}\n](?=\p{Alnum})/_/g;
Try this:
my $str = 'a-2=c+a()_';
$str =~ s/(?<=[A-Z0-9])[^A-Z0-9](?=[A-Z0-9])/_/gi;
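To see why look-arounds matter here, the capturing and look-around variants can be compared on the inputs mentioned above. This is a Python sketch rather than Perl (Python's re has no \K or \p{Alnum}, so plain ASCII classes are assumed); the function names are hypothetical.

```python
import re

def with_captures(s: str) -> str:
    # Consumes both neighbours, so overlapping cases like A#B%C are missed
    return re.sub(r'([A-Za-z0-9])[^A-Za-z0-9]([A-Za-z0-9])', r'\1_\2', s)

def with_lookarounds(s: str) -> str:
    # Only the middle character is consumed; neighbours are just checked
    return re.sub(r'(?<=[A-Za-z0-9])[^A-Za-z0-9](?=[A-Za-z0-9])', '_', s)

print(with_captures('#A#B%C'))         # #A_B%C
print(with_lookarounds('#A#B%C'))      # #A_B_C
print(with_lookarounds('a-2=c+a()_'))  # a_2_c_a()_
```

Because the look-arounds capture nothing, the replacement is just the underscore; there are no groups to refer back to.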

How can I match double-quoted strings with escaped double-quote characters?

I need a Perl regular expression to match a string. I'm assuming only double-quoted strings, that a \" is a literal quote character and NOT the end of the string, and that a \\ is a literal backslash character and should not escape a quote character. If it's not clear, some examples:
"\"" # string is 1 character long, contains double quote
"\\" # string is 1 character long, contains backslash
"\\\"" # string is 2 characters long, contains backslash and double quote
"\\\\" # string is 2 characters long, contains two backslashes
I need a regular expression that can recognize all 4 of these possibilities, and all other simple variations on those possibilities, as valid strings. What I have now is:
/".*[^\\]"/
But that's not right - it won't match any of those except the first one. Can anyone give me a push in the right direction on how to handle this?
/"(?:[^\\"]|\\.)*"/
This is almost the same as Cal's answer, but has the advantage of matching strings containing escape codes such as \n.
The ?: characters are there to prevent the contained expression being saved as a backreference, but they can be removed.
NOTE: as pointed out by Louis Semprini, this is limited to 32 KB texts due to a recursion limit built into Perl's regex engine (that unfortunately silently returns a failure when hit, instead of crashing loudly).
How about this?
/"([^\\"]|\\\\|\\")*"/
This matches zero or more of: a character that isn't a backslash or a quote, OR two backslashes, OR a backslash then a quote.
A generic solution (matching all backslashed characters):
/ \A " # Start of string and opening quote
(?: # Start group
[^\\"] # Anything but a backslash or a quote
| # or
\\. # Backslash and anything
)* # End of group
" \z # Closing quote and end of string
/xms
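A quick way to check this pattern against the four examples from the question is sketched below. It is a Python port, not Perl: fullmatch stands in for the \A and \z anchors and re.DOTALL for /s, and is_quoted_string is a name invented for the sketch.

```python
import re

# "  then any number of (non-backslash, non-quote char | backslash + any char)  then "
QUOTED = re.compile(r'"(?:[^\\"]|\\.)*"', re.DOTALL)

def is_quoted_string(s: str) -> bool:
    return QUOTED.fullmatch(s) is not None

samples = [r'"\""', r'"\\"', r'"\\\""', r'"\\\\"']
for sample in samples:
    print(sample, is_quoted_string(sample))  # True for all four

print(is_quoted_string(r'"\"'))  # False: the only closing quote is escaped
```

The two alternatives never match the same character, so a backslash always pairs with whatever follows it, and an escaped quote can never be taken for the closing one.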
See Text::Balanced. It's better than reinventing the wheel. Use gen_delimited_pat to see the resulting pattern and learn from it.
Regexp::Common is another useful tool to be aware of. It contains regexps for many common cases, including quoted strings:
use Regexp::Common;
my $str = '" this is a \" quoted string"';
if ($str =~ $RE{quoted}) {
# do something
}
Here's a very simple way:
/"(?:\\?.)*?"/
Just remember if you're embedding such a regex in a string to double the backslashes.
Try this pattern: (\".+")