PowerShell RegEx: get SID from string - regex

I am trying to run a powershell command to return a SID from a string.
$string = "The SID is S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3"
Select-String -Pattern "\bS-\d{1}-\d{1}-\d{10}-\d{10}-\d{9}-\d{10}-\d{10}-\d{1}-\d{1}-\d{1}-\d{1}-\d{1}\b" -InputObject $string
When I run the above, it returns the whole string, but I only want the SID, which is 'S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3'.

$Pattern = 'S-\d-(?:\d+-){1,14}\d+'
# Use a custom variable name: $Matches is an automatic variable populated by -match.
$Result = Select-String -Pattern $Pattern -InputObject $string
if ( $Result ) { $Result.Matches.Value }
Credit to vs97 for the regex pattern.

You can try the following regex:
S-\d-(?:\d+-){1,14}\d+
Explanation:
S- # Match S- literally
\d- # Match a digit- literally
(?:\d+-){1,14} # Non-capturing group: one or more digits followed by -, repeated 1 to 14 times
\d+ # Match one or more digits
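As a minimal sketch of how this pattern behaves, the snippet below applies it with the .NET [regex]::Match method instead of Select-String. That swap, and the variable names $text and $sid, are my own choices for illustration, not part of the answer above.

```powershell
# Sample input taken from the question.
$text = 'The SID is S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3'

# [regex]::Match returns the first match; .Value is the matched text
# (an empty string when nothing matches).
$sid = [regex]::Match($text, 'S-\d-(?:\d+-){1,14}\d+').Value
$sid
```

Here the group (?:\d+-) repeats ten times, which is within the 1 to 14 range, so the whole SID is returned.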

As shown in JosefZ's helpful answer, your only problem was that you didn't extract the match of interest from the properties of the [Microsoft.PowerShell.Commands.MatchInfo] object that your Select-String call outputs.
However, using a cmdlet is a bit heavy-handed in this case; the -replace operator offers a simpler and better-performing alternative:
$string = "The SID is S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3"
$string -replace `
'.*\b(S-\d-\d-\d{10}-\d{10}-\d{9}-\d{10}-\d{10}-\d-\d-\d-\d-\d)\b.*',
'$1'
I've simplified your regex a bit: \d{1} -> \d. Note that it doesn't match all possible forms that SIDs can have.
Note how the regex matches the entire input string and replaces it with just what the capture group ((...), the subexpression matching the SID) matched ($1).
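One thing to keep in mind with this approach: when the regex does not match at all, -replace returns the input unchanged rather than an empty result. The sketch below illustrates that, using the more general SID pattern from the other answers and a made-up input string ($noSid). The -match fallback at the end is just one possible way to detect the no-match case.

```powershell
$pattern = '.*\b(S-\d-(?:\d+-){1,14}\d+)\b.*'

$withSid = 'The SID is S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3'
$noSid   = 'There is no identifier here'   # hypothetical input without a SID

$extracted = $withSid -replace $pattern, '$1'   # just the SID
$unchanged = $noSid   -replace $pattern, '$1'   # no match: the input comes back as-is

# -match returns $true/$false for a single string and fills the
# automatic $Matches variable on success.
$sidOrNull = if ($noSid -match 'S-\d-(?:\d+-){1,14}\d+') { $Matches[0] } else { $null }
```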

Your expression seems to be just fine. With a small modification,
\bS(?:-\d){2}(?:-\d{10}){2}-\d{9}(?:-\d{10}){2}(?:-\d){5}\b
would also be OK, but other than that, your original should have worked.
Or with a lookbehind:
(?<=The SID is )\S+

Related

Add a word before and after a string

How can I add 2 words in front and behind of a regex matched string?
Example:
hi1,hi2,6d371555e08ba2b2397fd44a0db31605e7def831585c4c11dbb21c70d89e3b3551350e36d2cef84097077a4f5f12e5ee359625ec0f776403895039c4442860fa9968827ab119c8e8362c8a5cbef4389c2c36a08eda30ce091fe9a8e19f9eec0d,hi3
Regex to match string: \b[A-Fa-f0-9]{64}\b
String: 6d371555e08ba2b2397fd44a0db31605e7def831585c4c11dbb21c70d89e3b3551350e36d2cef84097077a4f5f12e5ee359625ec0f776403895039c4442860fa9968827ab119c8e8362c8a5cbef4389c2c36a08eda30ce091fe9a8e19f9eec0d
I want to add: hi1, hi2, hi3.
Use $& to reference the match in the replacement string:
$s = '6d37...ec0d'
$s -replace '\b[a-f0-9]{64}\b', 'hi1,hi2,$&,hi3'
Uppercase characters in the match expression are not required because PowerShell operators (-replace in this case) are case-insensitive by default.
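A self-contained sketch of the $& placeholder, assuming a generated 64-character hex string rather than the value from the question ($hash and $line are illustrative names):

```powershell
# Build a 64-character hex string for the demo (not a real hash).
$hash = 'a1b2c3d4' * 8
$line = "x,$hash,y"

# $& in the replacement stands for the whole match; the single quotes keep
# PowerShell from trying to expand it as a variable.
$result = $line -replace '\b[a-f0-9]{64}\b', 'hi1,hi2,$&,hi3'
$result
```

Note that \b only matches where the hex run is delimited by non-word characters (here, the commas), so a hex string embedded directly inside other letters or digits would not be matched by this pattern.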
Without knowing what to match, here is an example:
$str = 'klpo6d371555e08ba2b2397fd44a0db31605e7def831585c4c11dbb21c70d89e3b3551350e36d2cef84097077a4f5f12e5ee359625ec0f776403895039c4442860fa9968827ab119c8e8362c8a5cbef4389c2c36a08eda30ce091fe9a8e19f9eec0dputy'
if ($str -match '\b[A-Fa-f0-9]{64}\b'){
'hi1,hi2,{0},hi3' -f $matches[0]
}

Need a regex where two different substrings must not be included in a string

I have the following strings:
$string = @(
'Get-WindowsDevel'
'Put-WindowsDevel'
'Get-LinuxDevel'
'Put-LinuxDevel'
)
Now I need one regex with the following two rules:
$string must not start with "Get-"
$string must not contain "Linux"
This exclude the "Get-" at the beginning:
PS C:\> $string | Where-Object { $_ -match "^(?!Get-).*" }
Put-WindowsDevel
Put-LinuxDevel
I would expect that the following command does not match "Put-LinuxDevel" but it does:
PS C:\> $string | Where-Object { $_ -match "^(?!Get-).*(?!Linux)" }
Put-WindowsDevel
Put-LinuxDevel
So, what I need is a regex that is valid for this string only:
Put-WindowsDevel
Use -notmatch (or, if case-sensitive matching is needed, -cnotmatch) - i.e., a negated match - in combination with alternation (|):
PS> $string -notmatch '(^Get-|Linux)'
Put-WindowsDevel
The -match operator and its variations (as well as many other operators) can directly act on arrays as the LHS, in which case the operator acts as a filter and returns only matching elements.
Using -match on an array is much faster than using the Where-Object cmdlet in a pipeline for filtering.
Regex (^Get-|Linux) matches either Get- at the start of the string (^) or (|) substring Linux anywhere in the string.
Therefore, this regex matches strings that you don't want, and by using the negated form of -match - -notmatch - you therefore exclude those strings, as desired.
If you really want to express your regex as a positive match:
PS> $string -match '^(?!Get)((?!Linux).)*$'
Put-WindowsDevel
Note, however, that not only is this regex much more complex, it will also perform worse (albeit only slightly).
As for what you tried:
The .*(?!Linux) part of your regex - involving a negative lookahead assertion ((?!...)) - is not effective at excluding strings that contain substring Linux; e.g.:
PS> 'Linux' -match '.*(?!Linux)'
True # !!
The reason is that .* matches the entire string and then looks ahead to see if Linux isn't there - which is obviously true at the end of the string.
To effectively rule out a substring, the assertion must be applied around each character of the entire string:
PS> '', 'inux', 'Linux', 'a Linux', 'aLinuxb' -match '^((?!Linux).)*$'
# '' (empty string) matched
inux # 'inux' matched
Note how 'Linux', 'a Linux', and 'aLinuxb' were correctly excluded.
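To check that the negated and the positive formulation agree, here is a small sketch that runs both against the sample data from the question. The variable names are mine, and the positive pattern uses Get- (with the hyphen) to mirror the stated rule exactly.

```powershell
$names = 'Get-WindowsDevel', 'Put-WindowsDevel', 'Get-LinuxDevel', 'Put-LinuxDevel'

# Negated match: drop anything that starts with Get- or contains Linux.
$viaNotMatch = @($names -notmatch '^Get-|Linux')

# Positive match: no Get- at the start, and no position where Linux begins.
$viaPositive = @($names -match '^(?!Get-)((?!Linux).)*$')

$viaNotMatch
$viaPositive
```

Both expressions leave only Put-WindowsDevel for this input.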
this seems to do what you seek [grin] ...
$StringList = @(
'Get-WindowsDevel'
'Put-WindowsDevel'
'Get-LinuxDevel'
'Put-LinuxDevel'
)
$ExcludeList = @(
'^get'
'linux'
)
$RegexExcludeList = $ExcludeList -join '|'
$StringList -notmatch $RegexExcludeList
output ...
Put-WindowsDevel

Regular Expressions in powershell split

I need to strip out a UNC fqdn name down to just the name or IP depending on the input.
My examples would be
\\tom.overflow.corp.com
\\123.43.234.23.overflow.corp.com
I want to end up with just tom or 123.43.234.23
I have the following code in my array, which strips out the domain name perfectly, but I'm still left with \\tom:
-Split '\.(?!\d)')[0]
Your regex succeeds in splitting off the tokens of interest in principle, but it doesn't account for the leading \\ in the input strings.
You can use regex alternation (|) to include the leading \\ at the start as an additional -split separator.
Given that matching a separator at the very start of the input creates an empty element with index 0, you then need to access index 1 to get the substring of interest.
In short: The regex passed to -split should be '^\\\\|\.(?!\d)' instead of '\.(?!\d)', and the index used to access the resulting array should be [1] instead of [0]:
'\\tom.overflow.corp.com', '\\123.43.234.23.overflow.corp.com' |
ForEach-Object { ($_ -Split '^\\\\|\.(?!\d)')[1] }
The above yields:
tom
123.43.234.23
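To make the empty first element visible, this short sketch inspects the array that -split produces for one of the sample inputs ($parts is just an illustrative name):

```powershell
# Split on the leading \\ or on any dot that is not followed by a digit.
$parts = '\\tom.overflow.corp.com' -split '^\\\\|\.(?!\d)'

$parts.Count      # 5: '', 'tom', 'overflow', 'corp', 'com'
$parts[0] -eq ''  # True: the separator matched at the very start
$parts[1]         # tom
```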
Alternatively, you could remove the leading \\ in a separate step, using -replace:
'\\tom.overflow.corp.com', '\\123.43.234.23.overflow.corp.com' |
ForEach-Object { ($_ -Split '\.(?!\d)')[0] -replace '^\\\\' }
Yet another alternative is to use a single -replace operation, which does not require a ForEach-Object call (doesn't require explicit iteration):
'\\tom.overflow.corp.com', '\\123.43.234.23.overflow.corp.com' -replace
'(?x) ^\\\\ (.+?) \.\D .+', '$1'
Inline option (?x) (IgnoreWhiteSpace) allows you to make regexes more readable with insignificant whitespace: any unescaped whitespace can be used for visual formatting.
^\\\\ matches the \\ (escaped with \) at the start (^) of each string.
(.+?) matches one or more characters lazily.
\.\D matches a literal . followed by something other than a digit (\d matches a digit, \D is the negation of that).
.+ matches one or more remaining characters, i.e., the rest of the input.
$1 as the replacement operand refers to what the 1st capture group ((...)) in the regex matched, and, given that the regex was designed to consume the entire string, replaces it with just that.
I'm borrowing Lee_Dailey's $InStuff, but applying a regex I used recently:
$InStuff = -split @'
\\tom.overflow.corp.com
\\123.43.234.23.overflow.corp.com
'@
$InStuff |ForEach-Object {($_.Trim('\\') -split '\.(?!\d{1,3}(\.|$))')[0]}
Sample Output:
tom
123.43.234.23
As you can see here on RegEx101 the dots between the numbers are not matched
The Select-String cmdlet uses regex and populates a MatchInfo object with the matches (which can then be queried).
The regex "(\.?\d+)+|\w+" works for your particular example.
"\\tom.overflow.corp.com", "\\123.43.234.23.overflow.corp.com" |
Select-String "(\.?\d+)+|\w+" | % { $_.Matches.Value }
while this is NOT regex, it does work. [grin] i suspect that if you have a really large number of such items, then you will want a regex. they do tend to be faster than simple text operators.
this will get rid of the leading \\ and then replace the domain name with an empty string.
# fake reading in a text file
# in real life, use Get-Content
$InStuff = -split @'
\\tom.overflow.corp.com
\\123.43.234.23.overflow.corp.com
'@
$DomainName = '.overflow.corp.com'
$InStuff.ForEach({
$_.TrimStart('\\').Replace($DomainName, '')
})
output ...
tom
123.43.234.23

Replace text after special character

I have a string in which the numeric part should be replaced with text. In my case the variable is:
$string = '18.3.0-31290741.41742-1'
I want to replace everything from the first '-' onward with "-SNAPSHOT", so that echo $string shows the output below. I tried LastIndexOf(), Trim(), and other things, but I couldn't work out how to do it.
Expected result:
PS> echo $string
18.3.0-SNAPSHOT
The following may be a step in the right direction, but when the string contains two '-' characters, it replaces from the last one rather than the first, as you can see:
$string = "18.3.0-31290741.41742-1" -replace '(.*)-(.*)', '$1-SNAPSHOT'
.* is a greedy match, meaning it will produce the longest matching (sub)string. In your case that would be everything up to the last hyphen. You need either a non-greedy match (.*?) or a pattern that won't match hyphens (^[^-]*).
Demonstration:
PS C:\> '18.3.0-31290741.41742-1' -replace '(^.*?)-.*', '$1-SNAPSHOT'
18.3.0-SNAPSHOT
PS C:\> '18.3.0-31290741.41742-1' -replace '(^[^-]*)-.*', '$1-SNAPSHOT'
18.3.0-SNAPSHOT
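For contrast, a small sketch that puts the greedy attempt from the question next to the non-greedy fix ($version, $greedy, and $lazy are illustrative names):

```powershell
$version = '18.3.0-31290741.41742-1'

# Greedy: (.*) grabs as much as possible, so the hyphen that follows it
# is the LAST one in the string.
$greedy = $version -replace '(.*)-(.*)', '$1-SNAPSHOT'

# Non-greedy: (.*?) grabs as little as possible, so the hyphen that follows
# it is the FIRST one.
$lazy = $version -replace '^(.*?)-.*', '$1-SNAPSHOT'

$greedy   # 18.3.0-31290741.41742-SNAPSHOT
$lazy     # 18.3.0-SNAPSHOT
```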
By using a positive lookbehind assertion ((?<=...)) you could eliminate the need for a capturing group and backreference:
PS C:\> "18.3.0-31290741.41742-1" -replace '(?<=^.*?-).*', 'SNAPSHOT'
18.3.0-SNAPSHOT
You could use Select-String and a regular expression to match everything up to and including the first hyphen, then pass the match to ForEach-Object (commonly abbreviated with the alias %) to construct the final string:
$string = "18.3.0-31290741.41742-1" | Select-String -Pattern '^[^-]*-' | %{ "$($_.Matches.Value)SNAPSHOT" }
$string

Trim More than 20 Characters

I am working on a script that will generate AD usernames based off of a csv file. Right now I have the following line working.
Select-Object @{n='Username';e={$_.FirstName.ToLower() + $_.LastName.ToLower() -replace "[^a-zA-Z]" }}
Right now this takes the name and combines it into an AD-friendly name. However, I need the name to be shortened to no more than 20 characters. I have tried a few different methods to shorten the username, but I haven't had any luck.
Any ideas on how I can get the username shortened?
Probably the most elegant approach is to use a positive lookbehind in your replacement:
... -replace '(?<=^.{20}).*'
This expression matches the remainder of the string only if it is preceded by 20 characters at the beginning of the string (^.{20}).
Another option would be a replacement with a capturing group on the first 20 characters:
... -replace '^(.{20}).*', '$1'
This captures exactly the first 20 characters of the string and replaces the whole string with just the captured group ($1). Strings shorter than 20 characters don't match and are left unchanged.
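A quick sketch of both replacements on a long and a short sample string (the sample values and variable names are made up for illustration):

```powershell
$long  = 'abcdefghijklmnopqrstuvwxyz'   # 26 characters
$short = 'ab'

# Lookbehind variant: remove whatever follows the first 20 characters.
$long1  = $long  -replace '(?<=^.{20}).*'
$short1 = $short -replace '(?<=^.{20}).*'

# Capture-group variant: keep only the first 20 characters.
$long2  = $long  -replace '^(.{20}).*', '$1'
$short2 = $short -replace '^(.{20}).*', '$1'

$long1    # abcdefghijklmnopqrst
$short1   # ab
```

In both variants the short string comes back untouched because the regex simply does not match it.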
$str[0..19] -join ''
e.g.
PS C:\> 'ab'[0..19] -join ''
ab
PS C:\> 'abcdefghijklmnopqrstuvwxyz'[0..19] -join ''
abcdefghijklmnopqrst
Which I would try in your line as:
Select-Object @{n='Username';e={(($_.FirstName + $_.LastName) -replace "[^a-z]").ToLower()[0..19] -join '' }}
([a-z] because PowerShell regex matches are case-insensitive, and .ToLower() is moved so you only need to call it once).
And if you are using Set-StrictMode, then why not check the length to avoid going outside the bounds of the array with the delightful:
$str[0..[math]::Min($str.Length - 1, 19)] -join ''
To truncate a string in PowerShell, you can use the .NET String::Substring method. The following line will return the first $targetLength characters of $str, or the whole string if $str is shorter than that.
if ($str.Length -gt $targetLength) { $str.Substring(0, $targetLength) } else { $str }
If you prefer a regex solution, the following works (thanks to @PetSerAl):
$str -replace "(?<=.{$targetLength}).*"
A quick measurement shows the regex method to be about 70% slower than the substring method (942ms versus 557ms on a 200,000 line logfile)
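If the substring approach is needed in several places, it could be wrapped in a small helper. The sketch below is only one way to do it: the function name Get-TruncatedString and its parameter names are hypothetical, and it uses [Math]::Min to clamp the length instead of the if/else shown above.

```powershell
function Get-TruncatedString {
    param(
        [string] $Text,
        [int] $MaxLength = 20
    )
    # Substring throws if the requested length runs past the end of the
    # string, so clamp it to the actual length first.
    $Text.Substring(0, [Math]::Min($Text.Length, $MaxLength))
}

$truncated = Get-TruncatedString -Text 'abcdefghijklmnopqrstuvwxyz'
$untouched = Get-TruncatedString -Text 'ab'
$truncated   # abcdefghijklmnopqrst
$untouched   # ab
```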