PowerShell regex group replacing

I want to replace some text in every script file in a folder, and I'm trying to use this PowerShell code:
$pattern = '(FROM [a-zA-Z0-9_.]{1,100})(?<replacement_place>[a-zA-Z0-9_.]{1,7})'
Get-ChildItem -Path 'D:\Scripts' -Recurse -Include *.sql | ForEach-Object { (Get-Content $_.fullname) -replace $pattern, 'replace text' | Set-Content $_.fullname }
But I have no idea how to keep the first part of the expression and replace only the second. Any idea how I can do this? Thanks.

I'm not sure the provided regex for table names is correct, but either way you can reference captures in the replacement with $1, $2, and so on, using the following syntax: 'Doe, John' -ireplace '(\w+), (\w+)', '$2 $1'
Note that the replacement pattern either needs to be in single quotes ('') or have the $ signs of the replacement group specifiers escaped ("`$2 `$1").
# It may be better to use: $pattern = '(FROM) (?<replacement_place>[a-zA-Z0-9_.]{1,7})'
$pattern = '(FROM [a-zA-Z0-9_.]{1,100})(?<replacement_place>[a-zA-Z0-9_.]{1,7})'
Get-ChildItem -Path 'D:\Scripts' -Recurse -Include *.sql | ForEach-Object {
    # The parentheses read the whole file first, so it can be overwritten in the same pipeline
    (Get-Content $_.FullName) |
        ForEach-Object { $_ -replace $pattern, '$1 replace text' } |
        Set-Content $_.FullName -Force
}
If you need to reference other variables in your replacement expression (as you may), you can use a double-quoted string and escape the capture dollars with a backtick:
{ $_ -replace $pattern, "`$1 replacement text with $somePoshVariable" } |
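As a minimal sketch of keeping the first group and replacing only the second, the snippet below works on a single string rather than on files. The sample line, the simplified pattern, and the variable $newName are assumptions made for illustration; they are not from the question.

```powershell
# Hypothetical sample line and replacement value, for illustration only
$line    = 'SELECT * FROM dbo.OldTable WHERE id = 1'
$newName = 'dbo.NewTable'

# Group 1 captures 'FROM ' (kept); the named group captures the table name (replaced)
$pattern = '(FROM\s+)(?<replacement_place>[A-Za-z0-9_.]+)'

# Double quotes let $newName expand; the backtick keeps ${1} literal for the regex engine.
# Writing ${1} instead of $1 keeps the group number separate from the text that follows it.
$result = $line -replace $pattern, "`${1}$newName"
$result
```

Writing the group reference as ${1} is a defensive choice: if the substituted text happened to start with a digit, a bare $1 would be read as a different group number.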

Related

Powershell regex only select digits

I have a script that I am working on to parse each line in a log. My issue is that the regex I use matches everything from src= up to the next space.
I only want the IP address, not the src= part. I still need to match from src= up to the space, but the result should contain only the address. Below is what I use, but it is really clumsy, so any help would be appreciated.
# example text
$destination = "src=192.168.96.112 dst=192.168.5.22"
$destination -match 'src=[^\s]+'
$result = $matches.Values
# turn it into a string since Trim doesn't work
$result = echo $result
$result = $result.trim("src=")
You can use a lookbehind here, and since -match only returns the first match, you will be able to access the matched value using $matches[0]:
$destination -match '(?<=src=)\S+' | Out-Null
$matches[0]
# => 192.168.96.112
See the .NET regex demo.
(?<=src=) - matches a location immediately preceded with src=
\S+ - one or more non-whitespace chars.
To extract all these values, use
Select-String '(?<=src=)\S+' -input $destination -AllMatches | Foreach {$_.Matches} | Foreach-Object {$_.Value}
or
Select-String '(?<=src=)\S+' -input $destination -AllMatches | % {$_.Matches} | % {$_.Value}
Another way could be using a capturing group:
src=(\S+)
Regex demo | Powershell demo
For example
$destination = "src=192.168.96.112 dst=192.168.5.22"
$pattern = 'src=(\S+)'
Select-String $pattern -input $destination -AllMatches | Foreach-Object {$_.Matches} | Foreach-Object {$_.Groups[1].Value}
Output
192.168.96.112
Or, a bit more specifically, match the dots and the digits (or see this page for an even more specific match for an IP address):
src=(\d{1,3}(?:\.\d{1,3}){3})
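The lookbehind and the more specific digit pattern can also be combined through the .NET [regex] class directly. This is a sketch only; the sample string with a second src= entry is an assumption added to show that every match is returned.

```powershell
# Hypothetical log line with two src= entries, for illustration only
$destination = 'src=192.168.96.112 dst=192.168.5.22 src=10.0.0.7'

# (?<=src=) asserts that 'src=' precedes the match without including it in the value
$pattern = '(?<=src=)\d{1,3}(?:\.\d{1,3}){3}'

# [regex]::Matches returns every match; the pipeline enumerates the collection
$ips = [regex]::Matches($destination, $pattern) | ForEach-Object { $_.Value }
$ips
```

Note that \d{1,3} only checks the shape of the address, so a value such as 999.1.1.1 would still match.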

PowerShell - How to Update a file based on content from another file

I've searched all over, including here at Stack Overflow, and I cannot seem to find the solution I need. Here is my issue.
Let's say in File1.txt I have the following (no blank lines between the entries):
\\Serv02\LOC6\Client\726C30\032383\2200018023.pdf
\\Serv02\LOC6\Client\726C30\032383\2200718091.pdf
\\Serv02\LOC6\Client\726C30\030684\2300309040.pdf
\\Serv02\LOC6\Client\726C30\031274\2300429971.pdf
File2.txt will have the same information; however, I need to add a 1 right before the .pdf for each one (within file2.txt).
Example:
\\Serv02\LOC6\Client\726C30\032383\22000180231.pdf
I can easily update file2.txt using a regex; however, that updates the contents based on the regex alone.
File2.txt will have a lot more data in it than file1.txt (more of the exact same type of information). I only need to update file2.txt, adding the 1 right before .pdf, BASED on what is in file1.txt.
Here is the code I am using. As you can see, it does not read file1.txt at all; it just uses a regex to update file2.txt, adding the 1 before .pdf. That part works, but I'm not iterating through file1.txt.
clear-host
set-location c:\temp
$File = "C:\Temp\file1.txt"
$FileZ = "C:\Temp\file2.txt"
$File2 = (Get-ChildItem $fileZ) | Select -ExpandProperty BaseName
$regex01 = '(\\Serv02\LOC6\Client\726C30\\d{1,6}\\d{1,10})(.pdf)$'
get-content $fileZ | % { $_ -replace $regex01, '${1}1${2}' -join "`r`n" } | out-file -Encoding default "c:\Temp\$File2.txt"
start-sleep -Seconds 2
$NewMRC = Get-ChildItem "$file2.txt" | Select -ExpandProperty Name
Get-ChildItem $NewMRC | rename-item -NewName {$_.Name -replace ".txt",".MRC2"}
If file1.txt had another line that didn't match up to the RegEx as shown above, file2.txt would not be updated with that line
\\Serv03\LOC7\Client\780D30\031456\8675309123.pdf
I hope I have explained this well enough. I'm not new to PowerShell but I am far from an expert. Any assistance is greatly appreciated.
I've modified your code as follows. The approach is to read the content of File1.txt and store it in a variable, then iterate over each line of File2.txt and check it against the regex and against the File1 content. If the line matches both, replace it with whatever you want. The output goes to a .tmp file in append mode. Once all the lines in File2.txt are processed, the .tmp file replaces it.
clear-host
set-location c:\temp
$File = "file1.txt"
$FileZ = "file2.txt"
# PS2
$File1 = get-content $File | Out-String
# PS3
# $File1 = get-content $File -Raw
$File2 = (Get-ChildItem $fileZ) | Select -ExpandProperty BaseName
if( test-path "$File2.tmp" ) { remove-item "$File2.tmp" }
$regex01 = '(\\\\Serv02\\LOC6\\Client\\726C30\\\d{1,6}\\\d{1,10})(\.pdf)$'
get-content $fileZ | % {
    $line = $_
    $find = $line -replace '\\','\\'
    if ( ($line -match $regex01) -AND ( $File1 -match $find ) ) {
        $line -replace $regex01,'${1}1${2}' -join "`r`n"
    } else {
        $line
    }
} | out-file "$File2.tmp" -append
remove-item "$File2.txt"
rename-item "$File2.tmp" "$File2.txt"
#start-sleep -Seconds 2
#$NewMRC = Get-ChildItem "$file2.txt" | Select -ExpandProperty Name
#Get-ChildItem $NewMRC | rename-item -NewName {$_.Name -replace ".txt",".MRC2"}
Notes:
The last 3 lines of your code don't seem to be related to your problem statement, so I've commented them out.
$find = $line -replace '\\','\\': This replaces each single backslash \ with a double backslash \\, so that the line can itself be used as a regex. The first operand of -replace is a regex, so the backslash must be escaped there ('\\' matches one backslash); the second operand is a replacement string, where backslashes are literal ('\\' inserts two). So even though they look the same, they are interpreted differently.
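As an alternative sketch to escaping the backslashes by hand, .NET's [regex]::Escape escapes every regex metacharacter in a string, including the dot in .pdf. This is a swapped-in technique and not what the answer above uses; the variable names $manual, $escaped and $almost are chosen here for illustration.

```powershell
$line = '\\Serv02\LOC6\Client\726C30\032383\2200018023.pdf'

# Manual approach from the answer: only backslashes are escaped, the dot stays a wildcard
$manual = $line -replace '\\', '\\'

# [regex]::Escape also escapes the dot and any other metacharacters
$escaped = [regex]::Escape($line)

# A near miss where the dot is replaced by another character
$almost = '\\Serv02\LOC6\Client\726C30\032383\2200018023Xpdf'
$almost -match $manual    # True:  the unescaped dot matches 'X'
$almost -match $escaped   # False: the escaped dot matches only a literal dot
```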
One way to do this: Retrieve file content of first file into an array, then retrieve content of second file. For each line in second file: If first file's content has a line matching the current line, output modified line; otherwise, just output the current line.
$pattern = '(\\{2}(?:[^\\]+\\)+)([^\\\.]+)(\.pdf)'
$file1Content = Get-Content "file1.txt"
Get-Content "file2.txt" | ForEach-Object {
    if ( $file1Content -contains $_ ) {
        $_ | Select-String $pattern | ForEach-Object {
            "{0}{1}1{2}" -f
                $_.Matches[0].Groups[1].Value,
                $_.Matches[0].Groups[2].Value,
                $_.Matches[0].Groups[3].Value
        }
    }
    else {
        $_
    }
}
First match group ($_.Matches[0].Groups[1].Value) is \\servername\sharename\path, second match group is filename without extension, and third match group is the file extension.
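The same lookup-then-modify idea can be sketched with -contains and a single -replace. The in-memory arrays below stand in for the two files and are assumptions for illustration; with real files they would come from Get-Content.

```powershell
# Stand-ins for Get-Content 'file1.txt' and Get-Content 'file2.txt' (hypothetical data)
$file1Content = @(
    '\\Serv02\LOC6\Client\726C30\032383\2200018023.pdf'
)
$file2Content = @(
    '\\Serv02\LOC6\Client\726C30\032383\2200018023.pdf',
    '\\Serv03\LOC7\Client\780D30\031456\8675309123.pdf'
)

$updated = $file2Content | ForEach-Object {
    if ($file1Content -contains $_) {
        # Insert a literal 1 before the captured extension
        $_ -replace '(\.pdf)$', '1$1'
    }
    else {
        $_
    }
}
$updated
```

Only the line that also appears in the first list is changed; the other line passes through untouched.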

Powershell Remove Special Character(s) from Filenames

I am looking for a way to remove several special characters from filenames via a powershell script.
My filenames look like this:
[Report]_first_day_of_month_01_(generated_by_powershell)_[repnbr1].txt
I have been puzzling over removing the [] and everything between them, the () and everything between those, and removing all the _'s as well, with the desired result being a filename that looks like this:
first day of month 01.txt
Thus far, I have tried the below solution to no avail. I have run each of these from the directory in which the files reside.
Get-ChildItem -Path .\ -Filter *.mkv | %{
    $Name = $_.Name
    $NewName = $Name -Replace "(\s*)\(.*\)",''
    $NewName2 = $NewName -Replace "[\s*]\[.*\]",''
    $NewName3 = $NewName2 -Replace "_",' '
    Rename-Item -Path $_ -NewName $NewName3
}
Since it does not work even if I try and do one set at a time like this:
Get-ChildItem -Path .\ -Filter *.mkv | %{
    $Name = $_.Name
    $NewName = $Name -Replace "(\s*)\(.*\)",''
    Rename-Item -Path $_ -NewName $NewName
}
I assume there is an inherent flaw in the way I am trying to accomplish this task. That being said, I would prefer to use the Rename-Item cmdlet rather than using a move-item solution.
gci *.txt | Rename-Item -NewName {$_.Name -replace '_*(\[.*?\]|\(.*?\))_*' -replace '_+', ' '}
The -NewName script block applies two regex replacements. The first matches [text] or (text) blocks and replaces them with nothing; brackets and parentheses must be escaped in a regex to match them literally. It also consumes any leading or trailing underscores around each block, so that [Report]_ and _[repnbr1] are removed whole; otherwise an _ would be left at the start or end of the name and become a leading or trailing space. The second replaces the remaining underscores with spaces.
See the regex working here: Regex101
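To check the two replacements without touching any files, they can be run on the sample filename from the question as a plain string. This is a dry-run sketch; the variable names $name and $clean are chosen here for illustration.

```powershell
$name = '[Report]_first_day_of_month_01_(generated_by_powershell)_[repnbr1].txt'

# Step 1: drop [..] and (..) blocks together with adjacent underscores
# Step 2: turn the remaining runs of underscores into single spaces
$clean = $name -replace '_*(\[.*?\]|\(.*?\))_*' -replace '_+', ' '
$clean
# => first day of month 01.txt
```

Running Rename-Item with -WhatIf is another way to preview the result on real files before renaming anything.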

How to use regular expression matching groups in batch renames?

I'm trying to do some regular expression based bulk renames with PowerShell.
This successfully gives me only the files I need:
Get-ChildItem . | Where-Object { $_.Name -cmatch "(\b|_)(L|H|M|X{1,3})(_|\b)" }
(all those that contain an uppercase L, M, X, ...)
Next, I want to rename them, e.g. mycustom_M.png to processed_M.png, another_L.png to processed_L.png, and so forth.
Basically, I would use the regexp .*?(?:\b|_)(L|H|M|X{1,3})(?:_|\b).* to select the names, and processed_\1.png to replace them if I was in Notepad++, but I can't get it to work in PowerShell (I'm surely missing the right syntax here):
[...] | Rename-Item -NewName { $_.Name -replace ".*?(?:\b|_)(L|H|M|X{1,3})(?:_|\b).*","banner_$Matches.groups[1].value" }
Backreferences in PowerShell start with a $, not a \. However, you must either put the replacement expression in single quotes or escape the $, otherwise PowerShell would expand the $1 as a regular variable:
$pattern = ".*?(?:\b|_)(L|H|M|X{1,3})(?:_|\b).*"
... | Rename-Item -NewName { $_.Name -replace $pattern, 'banner_$1' }
or
$pattern = ".*?(?:\b|_)(L|H|M|X{1,3})(?:_|\b).*"
... | Rename-Item -NewName { $_.Name -replace $pattern, "banner_`$1" }
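The backreference can be tried on plain strings before renaming anything. This sketch uses the two filenames from the question; the choice of -creplace (the case-sensitive variant, mirroring the -cmatch in the question) and the processed_ prefix are assumptions about what the asker wants.

```powershell
$pattern = '.*?(?:\b|_)(L|H|M|X{1,3})(?:_|\b).*'
$names   = 'mycustom_M.png', 'another_L.png'

# Single quotes keep $1 away from PowerShell so the regex engine sees it
$renamed = $names | ForEach-Object { $_ -creplace $pattern, 'processed_$1.png' }
$renamed
```

Because the pattern begins with .*? and ends with .*, it consumes the whole name, so the replacement string has to supply the extension itself.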

Find multiple lines spanning text and replace using PowerShell

I am using a regular expression search to match up and replace some text. The text can span multiple lines (may or may not have line breaks).
Currently I have this:
$regex = "\<\?php eval.*?\>"
Get-ChildItem -exclude *.bak | Where-Object {$_.Attributes -ne "Directory"} | ForEach-Object {
    $text = [string]::Join("`n", (Get-Content $_))
    $text -replace $RegEx ,"REPLACED"
}
Try this:
$regex = New-Object Text.RegularExpressions.Regex "\<\?php eval.*?\>", ('singleline', 'multiline')
Get-ChildItem -exclude *.bak |
    Where-Object {!$_.PsIsContainer} |
    ForEach-Object {
        $text = (Get-Content $_.FullName) -join "`n"
        $regex.Replace($text, "REPLACED")
    }
A regular expression is explicitly created via New-Object so that options can be passed in.
Try changing your regex pattern to:
"(?s)\<\?php eval.*?\>"
to get singleline (dot matches any char including line terminators). Since you aren't using the ^ or $ metacharacters I don't think you need to specify multiline (^ & $ match embedded line terminators).
Update: The -replace operator is case-insensitive by default, so the i option isn't needed.
Alternatively, use the (.|\n)+ expression to cross line boundaries, since . doesn't match newlines by default.
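The effect of the (?s) option can be seen on a small in-memory string. This is a sketch only; the sample text is made up for illustration and stands in for a file that was read and joined with newlines.

```powershell
# Hypothetical multi-line content with a PHP block spanning several lines
$text = "<html>`n<?php eval(`n'payload'`n); ?>`n<body>"

# Without (?s) the dot stops at each newline, so nothing matches and nothing changes
$unchanged = $text -replace '\<\?php eval.*?\>', 'REPLACED'

# With (?s) the dot also matches newlines, so the whole block is replaced
$cleaned = $text -replace '(?s)\<\?php eval.*?\>', 'REPLACED'
$cleaned
```

The lazy .*? still stops at the first > it can reach, so only the PHP block is consumed and the following tag is left alone.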