Replace text after special character - regex

I have a string in which I need to replace the trailing numbers with text. In my case the variable is:
$string = '18.3.0-31290741.41742-1'
I want to replace everything after the first '-' with "SNAPSHOT", so that echo $string shows the output below. I tried LastIndexOf(), Trim() and a few other things, but I can't work out how to do it.
Expected result:
PS> echo $string
18.3.0-SNAPSHOT
The following may be on the right track, but when the string contains two '-' characters it replaces from the last one rather than the first, as you can see:
$string = "18.3.0-31290741.41742-1" -replace '(.*)-(.*)', '$1-SNAPSHOT'

.* is a greedy match, meaning it will produce the longest matching (sub)string. In your case that would be everything up to the last hyphen. You need either a non-greedy match (.*?) or a pattern that won't match hyphens (^[^-]*).
Demonstration:
PS C:\> '18.3.0-31290741.41742-1' -replace '(^.*?)-.*', '$1-SNAPSHOT'
18.3.0-SNAPSHOT
PS C:\> '18.3.0-31290741.41742-1' -replace '(^[^-]*)-.*', '$1-SNAPSHOT'
18.3.0-SNAPSHOT
By using a positive lookbehind assertion ((?<=...)) you could eliminate the need for a capturing group and backreference:
PS C:\> "18.3.0-31290741.41742-1" -replace '(?<=^.*?-).*', 'SNAPSHOT'
18.3.0-SNAPSHOT
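To see the greedy, lazy and negated-class variants side by side, here is a minimal sketch. The variable names are only illustrative and do not come from the question; the version string is the one the asker posted.

```powershell
# Illustrative variable names; the version string is the one from the question.
$version = '18.3.0-31290741.41742-1'

# Greedy: (.*) gives back characters only until the LAST hyphen fits.
$greedy = $version -replace '^(.*)-.*$', '$1-SNAPSHOT'

# Lazy: (.*?) expands only as far as the FIRST hyphen.
$lazy = $version -replace '^(.*?)-.*$', '$1-SNAPSHOT'

# Negated character class: [^-]* cannot cross a hyphen at all.
$negated = $version -replace '^([^-]*)-.*$', '$1-SNAPSHOT'

# A string without any hyphen does not match and is returned unchanged.
$noHyphen = '18.3.0' -replace '^([^-]*)-.*$', '$1-SNAPSHOT'

$greedy    # 18.3.0-31290741.41742-SNAPSHOT
$lazy      # 18.3.0-SNAPSHOT
$negated   # 18.3.0-SNAPSHOT
$noHyphen  # 18.3.0
```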

You could use Select-String and a regular expression to match everything up to and including the first hyphen, then pass the match to ForEach-Object (commonly shortened to its alias %) to construct the final string:
$string = "18.3.0-31290741.41742-1" | Select-String -Pattern '^[^-]*-' | %{ "$($_.Matches.Value)SNAPSHOT" }
$string
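If a regex replacement is not required, the same result can be sketched with the -split operator and a substring limit. This is an alternative technique rather than what any of the answers proposed, and $prefix is an illustrative name.

```powershell
$string = '18.3.0-31290741.41742-1'

# -split with a limit of 2 cuts only at the first hyphen:
# element 0 is the prefix, element 1 is everything after that hyphen.
$prefix = ($string -split '-', 2)[0]

# Rebuild the value with the fixed suffix.
$string = $prefix + '-SNAPSHOT'
$string    # 18.3.0-SNAPSHOT
```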

Related

Negative Lookbehind Works in Editor But Not in Powershell Script

I am attempting to replace spaces with comma-space for all instances in a string, while avoiding doubling commas that are already present in the string.
Test string:
'186 ATKINS, Cindy Maria 25 Every Street Smalltown, Student'
Using the following code:
Get-Content -Path $filePath |
ForEach-Object {
$match = ($_ | Select-String $regexPlus).Matches.Value
$c = ($_ | Get-Content)
$c = $c -replace $match,', '
$c
}
The output is:
'186, ATKINS,, Cindy, Maria, 25, Every, Street, Smalltown,, Student'
My $regexPlus value is:
$regexPlus = '(?s)(?<!,)\s'
I have tested the negative lookbehind assertion in my editor and it works. Why does it not work in this PowerShell script? The regex101 online editor produces this curious mention of case sensitivity:
Negative Lookbehind (?<!,)
Assert that the Regex below does not match
, matches the character , with index 44 in decimal (2C in hex or 54 in octal) literally (case sensitive)
I have tried editing to:
$match = ($_ | Select-String $regexPlus -CaseSensitive).Matches.Value
But it still doesn't work. Any ideas are welcome.
Part of the problem here is that you are forcing the replacement through a separate regex match when, as @WiktorStribiżew mentions, you can simply use -replace the way it is meant to be used: -replace does all the hard work for you.
When you do this:
$match = ($_ | Select-String $regexPlus).Matches.Value
That part works: you are looking for regex matches, and it finds one, namely a space character. But when you do this:
$c = $c -replace $match,', '
It interprets $match as a space character like this:
$c = $c -replace ' ',', '
And not as a regular expression that you might have been expecting. That's why it's not seeing the negative lookbehind for the commas, because all it is searching for are spaces, and it is dutifully replacing all the spaces with comma spaces.
The solution is simply to pass the regex text itself to -replace:
$regexPlus = '(?s)(?<!,)\s'
$c = $c -replace $regexPlus,', '
For example, here is the negative lookbehind working as advertised:
PS C:\> $str = '186 ATKINS, Cindy Maria 25 Every Street Smalltown, Student'
PS C:\> $regexPlus = '(?s)(?<!,)\s'
PS C:\> $str -replace $regexPlus,', '
186, ATKINS, Cindy, Maria, 25, Every, Street, Smalltown, Student
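The difference between the two calls can be reproduced in a few lines. This is a sketch using the test string from the question; $firstMatch, $wrong and $right are illustrative names.

```powershell
$line = '186 ATKINS, Cindy Maria 25 Every Street Smalltown, Student'
$regexPlus = '(?s)(?<!,)\s'

# Select-String returns the matched TEXT, which here is a single space.
$firstMatch = ($line | Select-String $regexPlus).Matches.Value

# Using that text as the pattern replaces every space, commas or not.
$wrong = $line -replace $firstMatch, ', '

# Using the regex itself keeps the lookbehind in play.
$right = $line -replace $regexPlus, ', '

$wrong   # 186, ATKINS,, Cindy, Maria, 25, Every, Street, Smalltown,, Student
$right   # 186, ATKINS, Cindy, Maria, 25, Every, Street, Smalltown, Student
```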
You can use
(Get-Content -Path $filePath) -replace ',*\s+', ', '
This replaces any run of zero or more commas followed by one or more whitespace characters with a single comma and space.
More details:
,* - zero or more commas
\s+ - one or more whitespace chars.
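As a quick check of the ',*\s+' pattern, here is a sketch that applies it to the test string from the question and to a made-up string with irregular spacing. The second input is an assumption added for illustration.

```powershell
$pattern = ',*\s+'

# Test string from the question.
$line = '186 ATKINS, Cindy Maria 25 Every Street Smalltown, Student'
$result = $line -replace $pattern, ', '

# Made-up input: runs of whitespace collapse into one separator as well.
$messy = 'a  b,   c'
$messyResult = $messy -replace $pattern, ', '

$result       # 186, ATKINS, Cindy, Maria, 25, Every, Street, Smalltown, Student
$messyResult  # a, b, c
```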

PowerShell RegEx: get SID from string

I am trying to run a powershell command to return a SID from a string.
$string = "The SID is S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3"
Select-String -Pattern "\bS-\d{1}-\d{1}-\d{10}-\d{10}-\d{9}-\d{10}-\d{10}-\d{1}-\d{1}-\d{1}-\d{1}-\d{1}\b" -InputObject $string
When I run the above, it returns the whole string, but I only want the SID itself, which is 'S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3'
$Pattern = 'S-\d-(?:\d+-){1,14}\d+'
$Result = Select-String -Pattern $Pattern -InputObject $string
if ( $Result ) { $Result.Matches.Value }
Credit to vs97 for the regex pattern.
You can try the following regex:
S-\d-(?:\d+-){1,14}\d+
Explanation:
S- # Match "S-" literally
\d- # Match a single digit followed by a hyphen
(?:\d+-){1,14} # Non-capturing group: one or more digits followed by a hyphen, repeated 1 to 14 times
\d+ # Match one or more digits
As shown in JosefZ's helpful answer, your only problem was that you didn't extract the match of interest from the properties of the [Microsoft.PowerShell.Commands.MatchInfo] object that your Select-String call outputs.
However, using a cmdlet is a bit heavy-handed in this case; the -replace operator offers a simpler and better-performing alternative:
$string = "The SID is S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3"
$string -replace `
'.*\b(S-\d-\d-\d{10}-\d{10}-\d{9}-\d{10}-\d{10}-\d-\d-\d-\d-\d)\b.*',
'$1'
I've simplified your regex a bit: \d{1} -> \d. Note that it doesn't match all possible forms that SIDs can have.
Note how the regex matches the entire input string and replaces it with just what the capture group ((...), the subexpression matching the SID) matched ($1).
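One thing to keep in mind with the -replace approach is that it returns the input unchanged when the pattern does not match. A minimal sketch of a variant using -match and the automatic $Matches variable follows; it borrows the more general pattern from the other answer, and $withSid, $withoutSid, $sid, $none and $unchanged are illustrative names.

```powershell
$pattern = 'S-\d-(?:\d+-){1,14}\d+'
$withSid = 'The SID is S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3'
$withoutSid = 'No SID in this line'

# -match fills the automatic $Matches variable; index 0 is the whole match.
$sid = if ($withSid -match $pattern) { $Matches[0] } else { $null }

# No match: the else branch makes the missing SID explicit.
$none = if ($withoutSid -match $pattern) { $Matches[0] } else { $null }

# -replace, by contrast, hands back the whole input when nothing matches.
$unchanged = $withoutSid -replace ".*($pattern).*", '$1'

$sid        # S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3
$unchanged  # No SID in this line
```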
Your expression seems to be just fine. A slightly more compact version,
\bS(?:-\d){2}(?:-\d{10}){2}-\d{9}(?:-\d{10}){2}(?:-\d){5}\b
would also be OK, but apart from that your original should have worked.
Or with a lookbehind:
(?<=The SID is )\S+
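The lookbehind pattern can be tried directly with the .NET [regex]::Match method, which returns only the matched text through its Value property. This is a sketch; note that [regex]::Match is case-sensitive by default, unlike the -match operator, and $sid and $missing are illustrative names.

```powershell
$string = 'The SID is S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3'

# The lookbehind requires the prefix but does not include it in the match.
$sid = [regex]::Match($string, '(?<=The SID is )\S+').Value

# When the prefix is absent, Success is False and Value is an empty string.
$missing = [regex]::Match('nothing to see here', '(?<=The SID is )\S+')

$sid               # S-1-9-1551374245-3853148685-361627209-2951783729-4075502353-0-0-0-0-3
$missing.Success   # False
```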

Add a word before and after a string

How can I add words in front of and behind a regex-matched string?
Example:
hi1,hi2,6d371555e08ba2b2397fd44a0db31605e7def831585c4c11dbb21c70d89e3b3551350e36d2cef84097077a4f5f12e5ee359625ec0f776403895039c4442860fa9968827ab119c8e8362c8a5cbef4389c2c36a08eda30ce091fe9a8e19f9eec0d,hi3
Regex to match string: \b[A-Fa-f0-9]{64}\b
String: 6d371555e08ba2b2397fd44a0db31605e7def831585c4c11dbb21c70d89e3b3551350e36d2cef84097077a4f5f12e5ee359625ec0f776403895039c4442860fa9968827ab119c8e8362c8a5cbef4389c2c36a08eda30ce091fe9a8e19f9eec0d
I want to add: hi1, hi2, hi3.
Use $& to reference the match in the replacement string:
$s = '6d37...ec0d'
$s -replace '\b[a-f0-9]{64}\b', 'hi1,hi2,$&,hi3'
Uppercase characters in the match expression are not required because PowerShell operators (-replace in this case) are case-insensitive by default.
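A shortened sketch of the same idea, assuming an 8-digit hex token in place of the 64-digit hash so the result is easy to read. The input line is made up for illustration. Single quotes around the replacement make sure $& reaches the regex engine verbatim.

```powershell
# Hypothetical input: an 8-digit hex token stands in for the 64-digit hash.
$line = 'a,DEADBEEF,b'

# -replace is case-insensitive, so [a-f] also matches the uppercase digits.
$wrapped = $line -replace '\b[a-f0-9]{8}\b', 'hi1,hi2,$&,hi3'

# -creplace is the case-sensitive variant: nothing matches, input is unchanged.
$caseSensitive = $line -creplace '\b[a-f0-9]{8}\b', 'hi1,hi2,$&,hi3'

$wrapped        # a,hi1,hi2,DEADBEEF,hi3,b
$caseSensitive  # a,DEADBEEF,b
```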
Without knowing what to match, here is an example:
$str = 'klpo6d371555e08ba2b2397fd44a0db31605e7def831585c4c11dbb21c70d89e3b3551350e36d2cef84097077a4f5f12e5ee359625ec0f776403895039c4442860fa9968827ab119c8e8362c8a5cbef4389c2c36a08eda30ce091fe9a8e19f9eec0dputy'
if ($str -match '\b[A-Fa-f0-9]{64}\b'){
'hi1,hi2,{0},hi3' -f $matches[0]
}

Need a regex where two different substrings must not be included in a string

I have the following strings:
$string = @(
'Get-WindowsDevel'
'Put-WindowsDevel'
'Get-LinuxDevel'
'Put-LinuxDevel'
)
Now I need one regex with the following two rules:
$string must not start with "Get-"
$string must not contain "Linux"
This exclude the "Get-" at the beginning:
PS C:\> $string | Where-Object { $_ -match "^(?!Get-).*" }
Put-WindowsDevel
Put-LinuxDevel
I would expect that the following command does not match "Put-LinuxDevel" but it does:
PS C:\> $string | Where-Object { $_ -match "^(?!Get-).*(?!Linux)" }
Put-WindowsDevel
Put-LinuxDevel
So, what I need is a regex that is valid for this string only:
Put-WindowsDevel
Use -notmatch (or, if case-sensitive matching is needed, -cnotmatch) - i.e., a negated match - in combination with alternation (|):
PS> $string -notmatch '(^Get-|Linux)'
Put-WindowsDevel
The -match operator and its variations (as well as many other operators) can directly act on arrays as the LHS, in which case the operator acts as a filter and returns only matching elements.
Using -match on an array is much faster than using the Where-Object cmdlet in a pipeline for filtering.
Regex (^Get-|Linux) matches either Get- at the start of the string (^) or (|) substring Linux anywhere in the string.
Therefore, this regex matches strings that you don't want, and by using the negated form of -match - -notmatch - you therefore exclude those strings, as desired.
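As a sketch of the array-filtering behaviour described above, the following compares the operator form with the equivalent Where-Object pipeline. $viaOperator and $viaPipeline are illustrative names.

```powershell
$string = @(
    'Get-WindowsDevel'
    'Put-WindowsDevel'
    'Get-LinuxDevel'
    'Put-LinuxDevel'
)

# With an array on the left, -notmatch returns the elements that do NOT match.
$viaOperator = $string -notmatch '^Get-|Linux'

# The same filter written as a pipeline; each element is tested on its own.
$viaPipeline = $string | Where-Object { $_ -notmatch '^Get-|Linux' }

$viaOperator   # Put-WindowsDevel
$viaPipeline   # Put-WindowsDevel
```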
If you really want to express your regex as a positive match:
PS> $string -match '^(?!Get)((?!Linux).)*$'
Put-WindowsDevel
Note, however, that not only is this regex much more complex, it will also perform worse (albeit only slightly).
As for what you tried:
The .*(?!Linux) part of your regex - involving a negative lookahead assertion ((?!...)) - is not effective at excluding strings that contain substring Linux; e.g.:
PS> 'Linux' -match '.*(?!Linux)'
True # !!
The reason is that .* matches the entire string and then looks ahead to see if Linux isn't there - which is obviously true at the end of the string.
To effectively rule out a substring, the assertion must be applied around each character of the entire string:
PS> '', 'inux', 'Linux', 'a Linux', 'aLinuxb' -match '^((?!Linux).)*$'
# '' (empty string) matched
inux # 'inux' matched
Note how 'Linux', 'a Linux', and 'aLinuxb' were correctly excluded.
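A different way to write the positive form, not the one used in this answer, is to anchor a single lookahead at the start of the string and let it scan ahead with .* itself. The sketch below assumes that formulation; $names, $kept and $probe are illustrative names.

```powershell
$names = @('Get-WindowsDevel', 'Put-WindowsDevel', 'Get-LinuxDevel', 'Put-LinuxDevel')

# (?!Get-)    : the string must not start with "Get-"
# (?!.*Linux) : from the start, "Linux" must not appear anywhere ahead
$kept = $names -match '^(?!Get-)(?!.*Linux)'

# The same inputs as in the demonstration above, filtered on the Linux rule only.
$probe = '', 'inux', 'Linux', 'a Linux', 'aLinuxb' -match '^(?!.*Linux)'

$kept          # Put-WindowsDevel
$probe.Count   # 2  ('' and 'inux')
```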
this seems to do what you seek [grin] ...
$StringList = @(
'Get-WindowsDevel'
'Put-WindowsDevel'
'Get-LinuxDevel'
'Put-LinuxDevel'
)
$ExcludeList = @(
'^get'
'linux'
)
$RegexExcludeList = $ExcludeList -join '|'
$StringList -notmatch $RegexExcludeList
output ...
Put-WindowsDevel
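Joining with '|' treats every list entry as a regex. If some entries are meant as literal text, they can be passed through [regex]::Escape first. The sketch below assumes a hypothetical exclude entry 'C++', whose plus signs would otherwise be read as quantifiers; the third list item and $LiteralExcludes are made up for illustration.

```powershell
# 'Put-C++Devel' and the 'C++' exclude entry are hypothetical additions.
$StringList = @('Put-WindowsDevel', 'Put-LinuxDevel', 'Put-C++Devel')
$LiteralExcludes = @('Linux', 'C++')

# Escape each literal so regex metacharacters lose their special meaning,
# then join the escaped pieces into one alternation.
$RegexExcludeList = ($LiteralExcludes | ForEach-Object { [regex]::Escape($_) }) -join '|'

$RegexExcludeList                          # Linux|C\+\+
$Filtered = $StringList -notmatch $RegexExcludeList
$Filtered                                  # Put-WindowsDevel
```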

Trim More than 20 Characters

I am working on a script that will generate AD usernames from a CSV file. Right now I have the following line working.
Select-Object @{n='Username';e={$_.FirstName.ToLower() + $_.LastName.ToLower() -replace "[^a-zA-Z]" }}
This takes the name and combines it into an AD-friendly username. However, I need the name to be shortened to no more than 20 characters. I have tried a few different methods to shorten the username, but I haven't had any luck.
Any ideas on how I can shorten the username?
Probably the most elegant approach is to use a positive lookbehind in your replacement:
... -replace '(?<=^.{20}).*'
This expression matches the remainder of the string only if it is preceded by 20 characters at the beginning of the string (^.{20}).
Another option would be a replacement with a capturing group on the first 20 characters:
... -replace '^(.{20}).*', '$1'
This captures exactly the first 20 characters of the string and replaces the whole string with just the captured group ($1). A string shorter than 20 characters does not match and is left unchanged.
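Both replacements can be checked against a long and a short input. This is a sketch using made-up sample strings.

```powershell
# Made-up sample inputs: one longer than 20 characters, one shorter.
$long = 'abcdefghijklmnopqrstuvwxyz'
$short = 'ab'

# Lookbehind: removes whatever follows the first 20 characters.
$longLookbehind = $long -replace '(?<=^.{20}).*'
$shortLookbehind = $short -replace '(?<=^.{20}).*'

# Capturing group: keeps group 1 and drops the rest.
$longGroup = $long -replace '^(.{20}).*', '$1'
$shortGroup = $short -replace '^(.{20}).*', '$1'

$longLookbehind   # abcdefghijklmnopqrst
$longGroup        # abcdefghijklmnopqrst
$shortLookbehind  # ab
$shortGroup       # ab
```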
$str[0..19] -join ''
e.g.
PS C:\> 'ab'[0..19]
ab
PS C:\> 'abcdefghijklmnopqrstuvwxyz'[0..19] -join ''
abcdefghijklmnopqrst
Which I would try in your line as:
Select-Object @{n='Username';e={(($_.FirstName + $_.LastName) -replace "[^a-z]").ToLower()[0..19] -join '' }}
([a-z] because PowerShell regex matches are case-insensitive, and moving .ToLower() so you only need to call it once).
And if you are using Set-StrictMode, then why not clamp the last index to the string's length to avoid going outside the bounds of the array, with the delightful:
$str[0..([math]::Min($str.Length, 20) - 1)] -join ''
To truncate a string in PowerShell, you can use the .NET String::Substring method. The following line will return the first $targetLength characters of $str, or the whole string if $str is shorter than that.
if ($str.Length -gt $targetLength) { $str.Substring(0, $targetLength) } else { $str }
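The length check can also be folded into the Substring call by clamping the count with [Math]::Min. The sketch below wraps that in a helper; the function name Limit-StringLength, its parameters and the sample row are hypothetical and only serve to show how it might slot into the username scenario from the question.

```powershell
# Hypothetical helper: returns at most $MaxLength leading characters of $Text.
function Limit-StringLength {
    param(
        [Parameter(Mandatory)][AllowEmptyString()][string]$Text,
        [Parameter(Mandatory)][int]$MaxLength
    )
    # Clamping the count keeps Substring inside the string's bounds.
    $Text.Substring(0, [Math]::Min($Text.Length, $MaxLength))
}

# Made-up row standing in for one line of the CSV file.
$row = [pscustomobject]@{ FirstName = 'Bartholomew-James'; LastName = "O'Shaughnessy" }

# Strip non-letters, lower-case once, then truncate to 20 characters.
$clean = (($row.FirstName + $row.LastName) -replace '[^a-z]').ToLower()
$username = Limit-StringLength -Text $clean -MaxLength 20

$username                                    # bartholomewjamesosha
Limit-StringLength -Text 'ab' -MaxLength 20  # ab
```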
If you prefer a regex solution, the following works (thanks to @PetSerAl):
$str -replace "(?<=.{$targetLength}).*"
A quick measurement shows the regex method to be about 70% slower than the substring method (942ms versus 557ms on a 200,000 line logfile)