PowerShell -replace to get string between two different characters - regex

I am currently using split to get what I need, but I am hoping there is a better way in PowerShell.
Here is the string:
server=ss8.server.com;database=CSSDatabase;uid=WS_CSSDatabase;pwd=abc123-1cda23-123-A7A0-CC54;Max Pool Size=5000
I want to get the server and database values without the server= or database= prefix.
Here is the method I am currently using:
$databaseserver = (($details.value).split(';')[0]).split('=')[1]
$database = (($details.value).split(';')[1]).split('=')[1]
This outputs to:
ss8.server.com
CSSDatabase
I would like it to be as simple as possible.
Thank you in advance

Replacing approach
You may use the following regex replace:
$s = 'server=ss8.server.com;database=CSSDatabase;uid=WS_CSSDatabase;pwd=abc123-1cda23-123-A7A0-CC54;Max Pool Size=5000'
$dbserver = $s -replace '^server=([^;]+).*', '$1'
$db = $s -replace '^[^;]*;database=([^;]+).*', '$1'
The technique is to match and capture (with (...)) what we need and just match what we need to remove.
Pattern details:
^ - start of the string
server= - a literal substring
([^;]+) - Group 1 (what $1 refers to) matching 1+ chars other than ;
.* - any 0+ chars other than a newline, as many as possible
Pattern 2 is almost the same, the capturing group is shifted a bit to capture another detail, and some more literal values are added to match the right context.
Note: if the values you need to extract may appear anywhere in the string, replace ^ in the first one and ^[^;]*; pattern in the second one with .*?\b (any 0+ chars other than a newline, as few as possible followed with a word boundary).
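As a quick check of the note above, here is a minimal sketch with the keys deliberately reordered (the reordered string is made up for illustration). Keep in mind that -replace returns the input unchanged when the pattern does not match, so a missing key does not give an empty result.

```powershell
# Hypothetical input: same kind of keys as above, but in a different order
$s = 'uid=WS_CSSDatabase;database=CSSDatabase;server=ss8.server.com;Max Pool Size=5000'

# .*?\b lazily skips ahead to the first place where the key starts at a word boundary
$dbserver = $s -replace '.*?\bserver=([^;]+).*', '$1'
$db       = $s -replace '.*?\bdatabase=([^;]+).*', '$1'

$dbserver   # ss8.server.com
$db         # CSSDatabase
```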
Matching approach
With a -match, you may do it the following way:
$s -match '^server=(.+?);database=([^;]+)'
The $Matches[1] will contain the server details and $Matches[2] will hold the DB info:
Name                           Value
----                           -----
2                              CSSDatabase
1                              ss8.server.com
0                              server=ss8.server.com;database=CSSDatabase
Pattern details
^ - start of string
server= - literal substring
(.+?) - Group 1: any 1+ non-linebreak chars as few as possible
;database= - literal substring
([^;]+) - 1+ chars other than ;

Another solution with a RegEx and named capture groups, similar to Wiktor's Matching Approach.
$s = 'server=ss8.server.com;database=CSSDatabase;uid=WS_CSSDatabase;pwd=abc123-1cda23-123-A7A0-CC54;Max Pool Size=5000'
$RegEx = '^server=(?<databaseserver>[^;]+);database=(?<database>[^;]+)'
if ($s -match $RegEx){
    $Matches.databaseserver
    $Matches.database
}
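If more than two values are needed, the same capture-group idea can be stretched to collect every key=value pair in one pass. This is only a sketch, not part of the answers above: the $settings name and the pattern are assumptions, and it presumes that values never contain a ; character.

```powershell
$s = 'server=ss8.server.com;database=CSSDatabase;uid=WS_CSSDatabase;pwd=abc123-1cda23-123-A7A0-CC54;Max Pool Size=5000'

# Collect every key=value pair into a hashtable (keys are case-insensitive by default)
$settings = @{}
foreach ($m in [regex]::Matches($s, '(?<key>[^=;]+)=(?<value>[^;]*)')) {
    $settings[$m.Groups['key'].Value] = $m.Groups['value'].Value
}

$settings['server']     # ss8.server.com
$settings['database']   # CSSDatabase
```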

Related

Powershell - Should take only set of numbers from file name

I have a script that reads a file name from a path and then takes only the numbers and does something with them. It works fine until I run into this situation.
For example:
For the file name Patch_1348968.vip it takes the number 1348968.
If the file name is Patch_1348968_v1.zip, it takes the number 13489681, which is wrong.
I am using this to fetch the numbers. In general the name always starts with patch_#####.vip with 7-8 digits, so I want to take only the digits before any sign like _ or -.
$PatchNumber = $file.Name -replace "[^0-9]" , ''
You can use
$PatchNumber = $file.Name -replace '.*[-_](\d+).*', '$1'
See the regex demo.
Details:
.* - any chars other than newline char as many as possible
[-_] - a - or _
(\d+) - Group 1 ($1): one or more digits
.* - any chars other than newline char as many as possible.
I suggest using -match instead, so you don't have to think in inverted terms:
if( $file.Name -match '\d+' ) {
    $PatchNumber = $matches[0]
}
\d+ matches the first consecutive sequence of digits. The automatic variable $matches contains the full match at index 0, if the -match operator successfully matched the input string against the pattern.
If you want to be more specific, you could use a more complex pattern and extract the desired sub string using a capture group:
if( $file.Name -match '^Patch_(\d+)' ) {
    $PatchNumber = $matches[1]
}
Here, the anchor ^ makes sure the match starts at the beginning of the input string, then Patch_ gets matched literally (case-insensitive), followed by a group of consecutive digits which gets captured () and can be extracted using $matches[1].
You can get an even more detailed explanation of the RegEx and the ability to experiment with it at regex101.com.
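To see the anchored pattern against both file names from the question, here is a small sketch. It uses plain strings in place of $file.Name, and the $names and $numbers variables are only for illustration.

```powershell
# The two file names from the question, as plain strings
$names = 'Patch_1348968.vip', 'Patch_1348968_v1.zip'

$numbers = foreach ($name in $names) {
    # Only the digits directly after "Patch_" are captured; "_v1" is ignored
    if ($name -match '^Patch_(\d+)') { $Matches[1] }
}

$numbers   # 1348968 for both names
```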

RegEx omit optional prefix in UPN or displayName

I am trying to get only the "nonpersonalizedusername" including its number or the surname.
To add more detail, I'd like to accomplish something like:
If there's an #-Symbol, get me everything that is in front of that #-Symbol, otherwise get me the whole string.
Plus, if then there's a dot "." in it, get me everything after that dot.
Let's assume I have the following strings of userPrincipalNames and/or displayNames:
nonpersonalizedusername004
nonpersonalizedusername019#domaina.local
prefixc.nonpersonalizedusername044#domaina.local
nonpersonalizedusername038#domainb.local
prefixa.nonpersonalizedusername002#domaina.local
prefixb.nonpersonalizedusername038#domainb.local
givenname.surname
givenname.surname#domaina.local
What I got so far is this expression:
^(?:.*?\.)?(.+?)(?:#.*)?$
but this only works, if there's an #-Symbol AND that "prefixing"-Dot in the string OR neither Dot nor #-Symbol.
If there's an #-Symbol, but no prefixing-dot, I'm getting only that "local"-part from the end.
https://regex101.com/r/1aflGH/1
You can use
^(?:[^#.]*\.)?([^#]+)(?:#.*)?$
See the regex demo. The \n is added to the negated character classes at regex101 as the test is run against a single multiline string.
Details:
^ - start of string
(?:[^#.]*\.)? - an optional sequence of any zero or more chars other than # and . and then a .
([^#]+) - Group 1: one or more chars other than # char
(?:#.*)? - an optional sequence of # and then the rest of the line
$ - end of string.
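A short PowerShell sketch of this pattern, applied per string with -replace so no multiline handling is needed. The $names and $result variables are illustrative, and the inputs are a subset of the samples from the question.

```powershell
$names = @(
    'nonpersonalizedusername004'
    'nonpersonalizedusername019#domaina.local'
    'prefixc.nonpersonalizedusername044#domaina.local'
    'givenname.surname'
    'givenname.surname#domaina.local'
)

# -replace works element-wise on an array; $1 keeps only the captured part
$result = $names -replace '^(?:[^#.]*\.)?([^#]+)(?:#.*)?$', '$1'
$result
```

Because each element is matched on its own, the \n workaround mentioned for regex101 is not required here.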
You might optionally repeat matches until the last dot before the #, and then capture the rest after that dot up to the # in group 1.
^(?:[^#.]*\.)*([^#.]+)
The pattern matches:
^ Start of string
(?: Non capture group
[^#.]*\. Optionally repeat matching any char except # or ., then match .
)* Close non capture group and optionally repeat
( Capture group 1
[^#.]+
) Close group 1
Powershell example
$s = @"
nonpersonalizedusername004
nonpersonalizedusername019#domaina.local
prefixc.nonpersonalizedusername044#domaina.local
nonpersonalizedusername038#domainb.local
prefixa.nonpersonalizedusername002#domaina.local
prefixb.nonpersonalizedusername038#domainb.local
givenname.surname
givenname.surname#domaina.local
"@
Select-String '(?m)^(?:[^#.\n]*\.)*([^#.\n]+)' -input $s -AllMatches | Foreach-Object {$_.Matches} | Foreach-Object {$_.Groups[1].Value}
Output
nonpersonalizedusername004
nonpersonalizedusername019
nonpersonalizedusername044
nonpersonalizedusername038
nonpersonalizedusername002
nonpersonalizedusername038
surname
surname

PowerShell Regular Expression match Y or Z

I am trying to match some strings using a regular expression in PowerShell, but I'm encountering difficulty because the format of the original string I'm extracting from varies. I admittedly am not very strong at creating regular expressions.
I need to extract the numbers from each of these strings. These can vary in length but in both cases will be preceded by FOO
PC1-FOO1234567
PC2-FOO1234567/FOO98765
This works for the second example:
'PC2-FOO1234567/FOO98765' -match 'FOO(.*?)\/FOO(.*?)\z'
It lets me access the matched strings using $matches[1] and $matches[2] which is great.
It obviously doesn't work for the first example. I suspect I need some way to match on either / or the end of the string but I'm not sure how to do this and end up with my desired match.
Suggestions?
You may use
'FOO(.*?)(?:/FOO(.*))?$'
It will match FOO, then capture any 0 or more chars as few as possible into Group 1 and then will attempt to optionally match a sequence of patterns: /FOO, any 0 or more chars as many as possible captured into Group 2 and then the end of string position should follow.
See the regex demo
Details
FOO - literal substring
(.*?) - Group 1: any zero or more chars other than newline, as few as possible
(?:/FOO(.*))? - an optional non-capturing group matching 1 or 0 repetitions of:
/FOO - a literal substring
(.*) - Group 2: any 0+ chars other than newline as many as possible (* is greedy)
$ - end of string.
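Here is a minimal sketch of that pattern run against both sample strings from the question. The $pattern and $results variables and the First/Second property names are made up for illustration. When the optional group does not take part in the match, $Matches[2] has no value.

```powershell
$pattern = 'FOO(.*?)(?:/FOO(.*))?$'

$results = foreach ($text in 'PC1-FOO1234567', 'PC2-FOO1234567/FOO98765') {
    if ($text -match $pattern) {
        # Second stays empty when there is no "/FOO..." part
        [pscustomobject]@{ Input = $text; First = $Matches[1]; Second = $Matches[2] }
    }
}

$results
```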
[edit - removed the unneeded pipe to Where-Object. thanks to mklement0 for that! [*grin*]]
this is a somewhat different approach. it splits on the foo, then replaces the unwanted / with nothing, and finally filters out any string that contains letters.
the pure regex solutions others offered will likely be faster, but this may be slightly easier to understand - and therefore to maintain. [grin]
# fake reading in a text file
# in real life, use Get-Content
$InStuff = @'
PC1-FOO1234567
PC2-FOO1234567/FOO98765
'@ -split [environment]::NewLine
$InStuff -split 'foo' -replace '/' -notmatch '[a-z]'
output ...
1234567
1234567
98765
To offer a more concise alternative with the -split operator, which obviates the need to access $Matches afterwards to extract the numbers:
PS> 'PC1-FOO1234568', 'PC2-FOO1234567/FOO98765' -split '(?:^PC\d+-|/)FOO' -ne ''
1234568 # single match from 1st input string
1234567 # first of 2 matches from 2nd input string
98765
Note: -split always returns a [string[]] array, even if only 1 string is returned; result strings from multiple input strings are combined into a single, flat array.
^PC\d+-|/ matches PC followed by 1 or more (+) digits (\d) and a - at the start of the string (^), or (|) a / char. Combined with the FOO that follows, this matches both PC2-FOO at the beginning and /FOO.
(?:...), a non-capturing subexpression, must be used to prevent -split from including what the subexpression matched in the results array.
-ne '' filters out the empty elements that result from the input strings starting with a separator.
To learn more about the regex-based -split operator and in what ways it is more powerful than the string literal-based .NET String.Split() method, see this answer.

How to detect the character before a number in RegEx

I have a string test_demo_0.1.1.
I want in PowerShell script to add before the 0.1.1 some text, for example: test_demo_shay_0.1.1.
I succeeded to detect the first number with RegEx and add the text:
$str = "test_demo_0.1.1"
if ($str -match "(?<number>\d)")
{
    $newStr = $str.Insert($str.IndexOf($Matches.number) - 1, "_shay")
}
# $newStr = test_demo_shay_0.1.1
The problem is, sometimes my string includes a number in another location, for example: test_demo2_0.1.1 (and then the insert is not good).
So I want to detect the first number which the character before is _, how can I do it?
I tried "(_<number>\d)" and "([_]<number>\d)" but it doesn't work.
What you ask for is called a positive lookbehind (a construct that checks for the presence of some pattern immediately to the left of the current location):
"(?<=_)(?<number>\d)"
^^^^^^
However, it seems all you want is to insert _shay before the first digit preceded with _. A replace operation will suit here best:
$str -replace '_(\d.*)', '_shay_$1'
Result: test_demo_shay_0.1.1.
Details
_ - an underscore
(\d.*) - Capturing group #1: a digit and then any 0+ chars to the end of the line.
The $1 in the replacement pattern is the contents matched by the capturing group #1.
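Both ideas can be tried on the two sample strings from the question. This is a sketch only: the variable names are illustrative, and the second part shows one possible way to combine the lookbehind with String.Insert by using the match position instead of IndexOf.

```powershell
$names = 'test_demo_0.1.1', 'test_demo2_0.1.1'

# Replace approach: the first "_" that is followed by a digit gets "_shay" put in front of it
$renamed = $names -replace '_(\d.*)', '_shay_$1'
$renamed   # test_demo_shay_0.1.1, test_demo2_shay_0.1.1

# Lookbehind approach: find the first digit preceded by "_" and insert at its index
$m = [regex]::Match('test_demo2_0.1.1', '(?<=_)\d')
$viaInsert = if ($m.Success) { 'test_demo2_0.1.1'.Insert($m.Index, 'shay_') }
$viaInsert # test_demo2_shay_0.1.1
```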

Explode string with comma when comma is not inside any brackets

I have the string "xyz(text1,(text2,text3)),asd". I want to explode it on , but the only condition is that the explode should happen only on a , that is not inside any brackets (here the brackets are ()).
I saw many such solutions on stackoverflow but it didn't work with my pattern. (example1) (example2)
What is correct regex for my pattern?
In my case xyz(text1,(text2,text3)),asd
result should be
xyz(text1,(text2,text3)) and asd.
You may use a matching approach using a regex with a subroutine:
preg_match_all('~\w+(\((?:[^()]++|(?1))*\))?~', $s, $m)
See the regex demo
Details
\w+ - 1+ word chars
(\((?:[^()]++|(?1))*\))? - an optional capturing group matching
\( - a (
(?:[^()]++|(?1))* - zero or more occurrences of
[^()]++ - 1+ chars other than ( and )
| - or
(?1) - the whole Group 1 pattern
\) - a ).
PHP demo:
$rx = '/\w+(\((?:[^()]++|(?1))*\))?/';
$s = 'xyz(text1,(text2,text3)),asd';
if (preg_match_all($rx, $s, $m)) {
    print_r($m[0]);
}
Output:
Array
(
[0] => xyz(text1,(text2,text3))
[1] => asd
)
If the requirement is to split at , but only outside nested parentheses, another idea would be to use preg_split and skip the parenthesized stuff, also by use of a recursive pattern.
$res = preg_split('/(\((?>[^)(]*(?1)?)*\))(*SKIP)(*F)|,/', $str);
See this pattern demo at regex101 or a PHP demo at eval.in
The left side of the pipe character is used to match and skip what is inside the parentheses.
On the right side it will match the remaining commas that are left outside of the parentheses.
The pattern used is a variant of different common patterns to match nested parentheses.
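For readers coming from the PowerShell threads above: the PCRE constructs used here, (?1) recursion and (*SKIP)(*F), are not available in the .NET regex engine. One possible alternative is a plain depth counter instead of a regex. The sketch below is an assumption-laden illustration, not a translation of the PHP answers; the function name Split-TopLevelComma is hypothetical.

```powershell
# Hypothetical helper: splits on commas that are not inside any parentheses
function Split-TopLevelComma {
    param([string]$Text)

    $parts   = [System.Collections.Generic.List[string]]::new()
    $current = [System.Text.StringBuilder]::new()
    $depth   = 0   # current nesting level of ( )

    foreach ($ch in $Text.ToCharArray()) {
        if ($ch -eq '(') { $depth++ }
        elseif ($ch -eq ')' -and $depth -gt 0) { $depth-- }

        if ($ch -eq ',' -and $depth -eq 0) {
            # Top-level comma: finish the current part and start a new one
            $parts.Add($current.ToString())
            [void]$current.Clear()
        } else {
            [void]$current.Append($ch)
        }
    }
    $parts.Add($current.ToString())
    $parts
}

$result = Split-TopLevelComma 'xyz(text1,(text2,text3)),asd'
$result   # xyz(text1,(text2,text3)) and asd
```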