Regex in PowerShell: keep the three characters before the text

Is there any easy way to do this?
input: 123215-85_01_test
expected output: 01_test
Another example
input: 12154_02_test
expected output: 02_test
The string "test" will always be present, but the numbering before it varies.
For example, in this code:
$path = "c:\tmp\*.sql"
get-childitem $path | forEach-object {
$name = $_.Name
$result = $name -replace "","" # I don't know how write this regex..
$extension = $_.Extension
$newName = $prefix+"_"+ $result -f, $extension
Rename-Item -Path $_.FullName -NewName $newName
}

There are two ways you can go at this: a simple split and join, or one of many possible regexes.
Split on underscore and rejoin last 2 elements
$split = "123215-85_01_test" -split "_"
$split[-2..-1] -join "_" # $split[-2,-1] would also work.
Regex to locate the data between the last underscores
"123215-85_01_test" -replace "^.*_(\d+)_(.*)$", '$1_$2'
Note that this can give unexpected results if there are more than 2 underscores.
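As a quick check, both approaches can be run against the two sample inputs from the question. This is only a minimal sketch; the variable names ($samples, $viaSplit, $viaRegex, $report) are illustrative and not part of the original answer.

```powershell
# Sample inputs taken from the question
$samples = '123215-85_01_test', '12154_02_test'

$report = foreach ($s in $samples) {
    # Approach 1: split on underscores and rejoin the last two elements
    $viaSplit = ($s -split '_')[-2..-1] -join '_'

    # Approach 2: capture the digits between the underscores and the trailing text
    $viaRegex = $s -replace '^.*_(\d+)_(.*)$', '$1_$2'

    # Emit one summary line per input
    '{0} -> {1} / {2}' -f $s, $viaSplit, $viaRegex
}
$report
```

For these two inputs both approaches should agree; they can diverge once a name contains additional underscores.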

Related

Using -like to find strings with Powershell

In Powershell, I need to find multiple errors within a text file and the closest desired word between them. Each error is contained in an array. In the past, I would use the code:
#creating null array
$results = @("")
#creating index for array
for ($i = 0; $i -lt ($errors.length - 1); $i++)
{
$results += $false
}
#selecting string
for ($i = 0; $i -lt $errors.length; $i++)
{
$k = $errors[$i]
$rg = [regex]"WORD.*?$k.*?WORD"
$results[$i] = $content | Select-String -Pattern $rg -AllMatches | Foreach-Object {($_.Matches |
ForEach-Object {$_.value})}
}
$errors is the array of errors, $content is the content of the text file, and each item in $results holds the string from desired word to error to desired word.
Using $results[$i] = $content | Select-String -Pattern $rg -AllMatches | Foreach-Object {($_.Matches | ForEach-Object {$_.value})} does not work because my errors contain wild cards like asterisks.
I know that to handle such characters I need to use -like.
I have tried using $results[$i] = $content -like $k instead, but that only returns a null value.
You can try:
for ($i = 0; $i -lt $errors.length; $i++)
{
$k = [regex]::Escape($errors[$i])
$rg = [regex]"(?s)WORD.*?$k.*?WORD"
$results[$i] = $content | Select-String -Pattern $rg -AllMatches |
Foreach-Object { $_.Matches.Value }
}
I do not know what is inside $errors, $results, or $content. For this to work as intended when $content spans multiple lines, $content needs to be a single string before it is piped into Select-String. Otherwise, your regex will only match against one line at a time.
Since you may want .*? to match across multiple lines, you should use the (?s) mode so that . can match newline characters.
If $errors contains characters that have special meaning in regex, you will need to backslash escape them. An easy way to do that is to call the Regex .NET class method Escape().
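To make the three points above concrete, here is a small self-contained sketch. The sample text and the error string are made up for illustration (the real contents of $content and $errors are unknown), and a here-string stands in for the file.

```powershell
# Hypothetical sample data: WORD marks the boundaries, and the error text
# contains characters (* and .) that are regex metacharacters.
# With a real file, Get-Content -Raw (PowerShell 3+) returns the text as one string.
$content = @'
WORD first line
error *** code 1.5
more text WORD
'@
$errors = @('*** code 1.5')

$results = foreach ($e in $errors) {
    $k  = [regex]::Escape($e)         # treat * and . as literal characters
    $rg = "(?s)WORD.*?$k.*?WORD"      # (?s) lets . match newline characters
    $content | Select-String -Pattern $rg -AllMatches |
        ForEach-Object { $_.Matches.Value }
}
$results
```

Without the call to Escape(), the leading * in the error text would make the pattern invalid rather than match a literal asterisk.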

Replacing any content inbetween second and third underscore

I have a PowerShell Scriptline that replaces(deletes) characters between the second and third underscore with an "_":
get-childitem *.pdf | rename-item -newname { $_.name -replace '_\p{L}+, \p{L}+_', "_"}
Examples:
12345_00001_LastName, FirstName_09_2018_Text_MoreText.pdf
12345_00002_LastName, FirstName-SecondName_09_2018_Text_MoreText.pdf
12345_00003_LastName, FirstName SecondName_09_2018_Text_MoreText.pdf
This _\p{L}+, \p{L}+_ regex only works for the first example. To replace everything inbetween I have used _(?:[^_]*)_([^_]*)_ (according to regex101 this should almost work) but the output is:
12345_09_MoreText.pdf
The desired output would be:
12345_00001_09_2018_Text_MoreText.pdf
12345_00002_09_2018_Text_MoreText.pdf
12345_00003_09_2018_Text_MoreText.pdf
How do I correctly replace the second and third underscore and everything inbetween with an "_"?
If you don't want to use regex -
$files = get-childitem *.pdf #get all pdf files
$ModifiedFiles, $New = @() #initializing both variables
foreach($file in $files)
{
$ModifiedFiles = $file.Name.split("_") #split the file name; the FileInfo object itself has no split method
$ModifiedFiles = $ModifiedFiles | Where-Object { $_ -ne $ModifiedFiles[2] } #omitting anything between second and third underscore
$New = $ModifiedFiles -join "_" #rejoin the remaining parts with underscores
Rename-Item -Path $file.FullName -NewName $New
}
Sample Data -
$files = "12345_00001_LastName, FirstName_09_2018_Text_MoreText.pdf", "12345_00002_LastName, FirstName-SecondName_09_2018_Text_MoreText.pdf", "12345_00003_LastName, FirstName SecondName_09_2018_Text_MoreText.pdf"
$ModifiedFiles, $New = @() #initializing both variables
foreach($file in $files)
{
$ModifiedFiles = $file.split("_")
$ModifiedFiles = $ModifiedFiles | Where-Object { $_ -ne $ModifiedFiles[2] } #omitting anything between second and third underscore
$New = $ModifiedFiles -join "_" #rejoin the remaining parts with underscores
}
You may use
-replace '^((?:[^_]*_){2})[^_]+_', '$1'
See the regex demo
Details
^ - start of the line
((?:[^_]*_){2}) - Group 1 (the value will be referenced to with $1 from the replacement pattern): two repetitions of
    [^_]* - 0+ chars other than an underscore
    _ - an underscore
[^_]+ - 1 or more chars other than _
_ - an underscore
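The pattern can be tried on plain strings before renaming any files. A minimal sketch using the three sample names from the question ($names and $renamed are illustrative variable names):

```powershell
# The three example file names from the question
$names = @(
    '12345_00001_LastName, FirstName_09_2018_Text_MoreText.pdf'
    '12345_00002_LastName, FirstName-SecondName_09_2018_Text_MoreText.pdf'
    '12345_00003_LastName, FirstName SecondName_09_2018_Text_MoreText.pdf'
)

# Keep the first two underscore-terminated fields ($1) and drop the third field
$renamed = $names -replace '^((?:[^_]*_){2})[^_]+_', '$1'
$renamed
```

Because -replace operates on each element of an array, all three names are processed in one statement.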
To offer an alternative solution that avoids a complex regex: The following is based on the -split and -join operators and shows PowerShell's flexibility with respect to array slicing:
Get-ChildItem *.pdf | Rename-Item -NewName { ($_.Name -split '_')[0..1 + 3..6] -join '_' } -WhatIf
$_.Name -split '_' splits the filename by _ into an array of tokens (substrings).
Array slice [0..1 + 3..6] combines two range expressions (..) to essentially remove the token with index 2 from the array.
-join '_' reassembles the modified array into a _-separated string, yielding the desired result.
Note: 6, the upper array bound, is hard-coded above, which is suboptimal, but sufficient with input as predictable as in this case.
As of Windows PowerShell v5.1 / PowerShell Core 6.1.0, in order to determine the upper bound dynamically, you require the help of an auxiliary variable, which is clumsy:
Get-ChildItem *.pdf |
Rename-Item -NewName { ($arr = $_.Name -split '_')[0..1 + 3..($arr.Count-1)] -join '_' } -WhatIf
Wouldn't it be nice if we could write [0..1 + 3..] instead?
This and other improvements to PowerShell's slicing syntax are the subject of this feature suggestion on GitHub.
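The slicing idea can also be tried on a plain string, without touching any files. The sketch below uses one of the sample names from the question and computes the upper bound from the token count; $name, $tokens and $kept are illustrative names.

```powershell
$name = '12345_00002_LastName, FirstName-SecondName_09_2018_Text_MoreText.pdf'

# Split the name into underscore-separated tokens
$tokens = $name -split '_'

# Keep indices 0..1 and 3..(last index), i.e. drop only the token at index 2
$kept = $tokens[0..1 + 3..($tokens.Count - 1)]

$result = $kept -join '_'
$result
```

This relies on the range operator (..) binding more tightly than +, so the two ranges are built first and then concatenated into one index array.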
here's one other way ... using string methods.
'12345_00003_LastName, FirstName SecondName_09_2018_Text_MoreText.pdf'.
Split('_').
Where({
$_ -notmatch ','
}) -join '_'
result = 12345_00003_09_2018_Text_MoreText.pdf
that does the following ...
split on the underscores
toss out any item that has a comma in it
join the remaining items back into a string with underscores
i suspect that the pure regex solution will be faster, but you may want to use this simply to have something that is easier to understand when you next need to modify it. [grin]

Replace until a pattern is found in PowerShell

I have been trying to transform the following string:
CN=John Doe,OU=IT,OU=Support,OU=Department,OU=HQ,DC=FR,DC=CONTOSO,DC=COM
to:
FR.CONTOSO.COM
As a first step, I tried to remove everything until ",DC" pattern.
I thought I could use the un-greedy ".+?" to match everything until the first ",DC" pattern:
$str = 'CN=John Doe,OU=IT,OU=Support,OU=Department,OU=HQ,DC=FR,DC=CONTOSO,DC=COM'
$str -replace '.+?,DC', ''
This returns:
=COM
Any idea why it's getting only the last one even with the un-greedy version?
How could I do this?
Just split the string:
$parts = ('CN=John Doe,OU=IT,OU=Support,OU=Department,OU=HQ,DC=FR,DC=CONTOSO,DC=COM' -split ',')
$newString = '{0}.{1}.{2}' -f $parts[-3].split('=')[1], $parts[-2].split('=')[1], $parts[-1].split('=')[1]
OK, so I took @EBGreen's advice and decided to make a function that does all the work:
#Requires -Version 4
Function ConvertFrom-DistinguishedName
{
[CmdletBinding()]
[OutputType('System.Management.Automation.PSCustomObject')]
Param(
[Parameter(Position = 0, Mandatory)]
[Alias('DistinguishedName', 'Name')]
[ValidatePattern('^CN=.+?,(OU=.+?,)+DC=.+$')]
[string]
$Path
)
$local:ErrorActionPreference = [System.Management.Automation.ActionPreference]::Stop
$WS = $Path -split ',' # working set
$User = $WS[0] -replace 'CN='
$Domain = $WS.Where({$PSItem.StartsWith('DC=')}).ToLower() -replace 'dc=' -join '.'
$OU = $WS.Where({$PSItem.StartsWith('OU=')}) -replace 'OU='
[array]::Reverse($OU)
[pscustomobject]@{
'User' = $User
'Domain' = $Domain
'OU' = $OU -join '\'
}
}
PS C:\> ConvertFrom-DistinguishedName 'CN=John Doe,OU=IT,OU=Support,OU=Department,OU=HQ,DC=FR,DC=CONTOSO,DC=COM'
User     Domain         OU
----     ------         --
John Doe fr.contoso.com HQ\Department\Support\IT
The end result? This converts your distinguished AD name to a PSCustomObject you can easily work with.
To get the domain out of the DistinguishedName, you will want to use -split. Finally, we can use -join to insert '.' between the DCs.
This will store your domain string to $domain:
$dn = "CN=John Doe,OU=IT,OU=Support,OU=Department,OU=HQ,DC=FR,DC=CONTOSO,DC=COM"
$domain = $dn -Split "," | ? {$_ -like "DC=*"}
$domain = $domain -join "." -replace ("DC=", "")
Write-Host "Current Domain: " $domain
You should use the ^ to start from the beginning.
$str = 'CN=John Doe,OU=IT,OU=Support,OU=Department,OU=HQ,DC=FR,DC=CONTOSO,DC=COM'
$str = $str -replace '^.+?,DC',''
$str = $str -replace 'DC=',''
$str -replace ',','.'
Since nobody bothered explaining the behavior yet:
The replacement operation replaces every occurrence of the pattern .+?,DC in the string, so it first removes CN=John Doe,OU=IT,OU=Support,OU=Department,OU=HQ,DC, then continues where that replacement left off and removes =FR,DC, and then =CONTOSO,DC, leaving you with just =COM.
To avoid this behavior you need to anchor the expression at the beginning of the string (^), as others have already suggested. The second replacement for substituting the remaining ,DC= with dots can be daisy-chained to the first one, so you need just one statement:
$str -replace '^.*?,dc=' -replace ',dc=', '.'
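The behavior described above can be observed directly by listing the individual matches. This is a small sketch; $hits and $domain are illustrative names, and [regex]::Matches is used only to make the separate matches visible.

```powershell
$str = 'CN=John Doe,OU=IT,OU=Support,OU=Department,OU=HQ,DC=FR,DC=CONTOSO,DC=COM'

# Unanchored: the lazy pattern matches repeatedly, each match starting
# where the previous one ended
$hits = [regex]::Matches($str, '.+?,DC') | ForEach-Object { $_.Value }
$hits

# Anchored: only the leading stretch is removed, then each ',DC=' becomes '.'
$domain = $str -replace '^.*?,DC=' -replace ',DC=', '.'
$domain
```

The lazy quantifier only limits how far each individual match extends; it does not stop -replace from finding further matches later in the string.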

Select all backslashes between two chars

I am working on a powershell script and I've got several text files where I need to replace backslashes in lines which matches this pattern: .. >\\%name% .. < .. (.. could be anything)
Example string from one of the files where the backslashes should match:
<Tag>\\%name%\TST$\Program\1.0\000\Program.msi</Tag>
Example string from one of the files where the backslashes should not match:
<Tag>/i /L*V "%TST%\filename.log" /quiet /norestart</Tag>
So far I've managed to select every char between >\\%name% and < with this expression (Regex101):
(?<=>\\\\%name%)(.*)(?=<)
but I failed to select only the backslashes.
Is there a solution which I could not yet find?
I'd recommend selecting the relevant tags with an XPath expression and then do the replacement on the text body of the selected nodes.
$xml.SelectNodes('//Tag[substring(., 1, 8) = "\\%name%"]') | ForEach-Object {
$_.'#text' = $_.'#text' -replace '\\', '\\'
}
So here's my solution:
$original_file = $Filepath
$destination_file = $Filepath + ".new"
Get-Content -Path $original_file | ForEach-Object {
$line = $_
if ($line -match '(?<=>\\\\%name%)(.*)(?=<)'){
$line = $line -replace '\\','/'
}
$line
} | Set-Content -Path $destination_file
Remove-Item $original_file
Rename-Item $destination_file.ToString() $original_file.ToString()
So this replaces every \ with a / in lines that match the given pattern, but not in the way my question asked for.
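For completeness, one possible way to match only the backslashes is a variable-length lookbehind, which the .NET regex engine used by PowerShell supports. The following is a sketch under that assumption, not a tested drop-in for the script above; $lines, $pattern and $converted are illustrative names, and the '/' replacement is taken from the script above.

```powershell
# The two example lines from the question
$lines = @(
    '<Tag>\\%name%\TST$\Program\1.0\000\Program.msi</Tag>'
    '<Tag>/i /L*V "%TST%\filename.log" /quiet /norestart</Tag>'
)

# Match a single backslash only when it is preceded (within the same tag body)
# by >\\%name% and followed by the closing <
$pattern = '(?<=>\\\\%name%[^<]*)\\(?=[^<]*<)'

$converted = $lines -replace $pattern, '/'
$converted
```

The two leading backslashes before %name% are left alone because the lookbehind requires the full >\\%name% prefix to appear before the backslash being matched.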

Replace different occurrences of a string with different values in PowerShell?

I am pretty new to PowerShell scripting. The scenario is that I have to replace the first occurrence of a string with one value and the second occurrence with a different value.
So far, I have this :
$dbS = Select-String $repoPath\AcceptanceTests\sample.config -Pattern([regex]'dbServer = "#DB_SERVER#"')
write-output $dbS[0]
write-output $dbS[1]
This gives the output as :
D:\hg\default\AcceptanceTests\sample.config:5: dbServer = "#DB_SERVER#"
D:\hg\default\AcceptanceTests\sample.config:12: dbServer = "#DB_SERVER#"
I can see that both occurrences are correct, and this returns a MatchInfo object. Now I need to replace the contents. I tried:
Get-Content $file | ForEach-Object { $_ -replace "dbserver",$dbS[0] } | Set-Content ($file+".tmp")
Remove-Item $file
Rename-Item ($file+".tmp") $file
But this replaces all occurrences, and with the entire path at that. Please help.
Here is what I have come up with:
$dbs = Select-String .\test.config -pattern([regex]'dbServer = "Test1"')
$file = Get-Content .\test.config
$dbs | % {$file[$_.linenumber-1] = $file[$_.linenumber-1] -replace "Test1", "Test3" }
set-content .\test.config $file
It cycles through all results of Select-String and uses its .LineNumber Property (-1) as array index to replace the text only in that line. Next we just set the content again.
If you want to assign different values for occurrences 1 and 2, you can do this:
#replace first occurrence
$file[$dbs[0].LineNumber-1] = $file[$dbs[0].LineNumber-1] -replace "Test1", "Test2"
#replace second occurrence
$file[$dbs[1].LineNumber-1] = $file[$dbs[1].LineNumber-1] -replace "Test1", "Test3"
This approach obviously only works if you know how many occurrences you will have and which of them you want to replace.
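A different technique, not the one used in the answer above, is to let the .NET Regex.Replace method call a script block for every match and hand out the replacement values in order. The sketch below assumes the file content is available as one string (a here-string stands in for it); $text, $values, $state and $updated are hypothetical names.

```powershell
# Hypothetical file content with two identical occurrences
$text = @'
dbServer = "Test1"
other = 1
dbServer = "Test1"
'@

$values = 'Test2', 'Test3'     # replacement for the 1st and 2nd occurrence
$state  = @{ Index = 0 }       # mutable counter shared with the script block

# The script block is converted to a MatchEvaluator and runs once per match
$updated = [regex]::Replace($text, 'Test1', {
    param($m)
    if ($state.Index -lt $values.Count) {
        $values[$state.Index]  # next replacement value
        $state.Index++
    }
    else {
        $m.Value               # no value left: keep the original text
    }
})
$updated
```

The counter lives in a hashtable so that the script block can update it without scope modifiers; any occurrences beyond the supplied values are left unchanged.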