How to use regular expression matching groups in batch renames? - regex

I'm trying to do some regular expression based bulk renames with PowerShell.
This successfully gives me only the files I need:
Get-ChildItem . | Where-Object { $_.Name -cmatch "(\b|_)(L|H|M|X{1,3})(_|\b)" }
(all those that contain an uppercase L, M, X, ...)
Next, I want to rename them, e.g. mycustom_M.png to processed_M.png, another_L.png to processed_L.png, and so forth.
Basically, I would use the regexp .*?(?:\b|_)(L|H|M|X{1,3})(?:_|\b).* to select the names, and processed_\1.png to replace them if I was in Notepad++, but I can't get it to work in PowerShell (I'm surely missing the right syntax here):
[...] | Rename-Item -NewName { $_.Name -replace ".*?(?:\b|_)(L|H|M|X{1,3})(?:_|\b).*","banner_$Matches.groups[1].value" }

Backreferences in PowerShell start with a $, not a \. However, you must either put the replacement expression in single quotes or, in a double-quoted string, escape the $ with a backtick. Otherwise PowerShell expands $1 as a regular variable before the regex engine ever sees it:
$pattern = ".*?(?:\b|_)(L|H|M|X{1,3})(?:_|\b).*"
... | Rename-Item -NewName { $_.Name -replace $pattern, 'banner_$1' }
or
$pattern = ".*?(?:\b|_)(L|H|M|X{1,3})(?:_|\b).*"
... | Rename-Item -NewName { $_.Name -replace $pattern, "banner_`$1" }
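To check the two quoting styles without touching any files, the replacement can be tried on plain strings first. This is a minimal sketch: the sample names come from the question, the processed_ prefix and .png suffix are taken from the question's desired output, and -creplace is an assumption made here to mirror the case-sensitive -cmatch filter. Because the pattern ends in .*, it consumes the extension, so the replacement string has to add it back.

```powershell
# Pattern from the answer above; single quotes keep PowerShell from touching it.
$pattern = '.*?(?:\b|_)(L|H|M|X{1,3})(?:_|\b).*'

# Sample names from the question (plain strings, no files involved).
$names = 'mycustom_M.png', 'another_L.png'

# Single-quoted replacement: $1 reaches the regex engine untouched.
$single = $names | ForEach-Object { $_ -creplace $pattern, 'processed_$1.png' }

# Double-quoted replacement: the backtick stops PowerShell from expanding $1.
$double = $names | ForEach-Object { $_ -creplace $pattern, "processed_`$1.png" }

$single   # processed_M.png, processed_L.png
$double   # processed_M.png, processed_L.png
```

Once the output looks right, the same -creplace expression can go inside the Rename-Item -NewName script block, using $_.Name as the input.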

Related

How to match first characters with a powershell script

I am trying to move files to a certain folder if they start with a letter and delete them if they start with anything other than a letter.
My code:
Function moveOrDelete($source, $dest)
{
$aToZ = '^[a-zA-Z].*'
$notALetter = '^[^a-zA-Z].*'
Get-ChildItem -Path $source\$aToZ -Recurse | Move-Item -Destination $dest
Get-ChildItem -Path $source\$notALetter -Recurse | Remove-Item
}
As I understand it, the caret matches at the start of the string when it's outside of the brackets. In other words, the regex in the $aToZ variable will match anything that begins with a letter. The .* part allows the rest of the file name to be anything. The caret inside the brackets negates the character class, so a file name that begins with anything other than a letter will match. I can't get it to work and I'm not getting any errors, which leads me to believe that my regex is wrong.
I have checked this with online tools including this one: https://regex101.com/ and they check out.
I have also used variations of the regex like ^[a-zA-Z] that don't work. Some patterns like [a-zA-Z]* move the files but it's not the pattern that I want.
Here is how I'm calling the function:
moveOrDelete ".\source" ".\dest"
And here are the sample file names I'm using:
a.txt
z.txt
1.txt
.txt
The -Path argument doesn't understand regular expressions. It takes a string and supports wildcards (such as * and ?), but it cannot evaluate a regex pattern.
So you need to check the name of each file against the regex with the -match operator. The following should help:
Function moveOrDelete($source, $dest)
{
$aToZ = '^[a-zA-Z].*'
$notALetter = '^[^a-zA-Z].*'
Get-ChildItem -Path $source -Recurse | Where-Object { $_.name -match $aToZ } | Move-Item -Destination $dest
Get-ChildItem -Path $source -Recurse | Where-Object { $_.name -match $notALetter } | Remove-Item
}
Here, you need to filter the file names with the Where-Object cmdlet, then pipe to the move or remove.
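The filtering step can be tried on the sample names from the question before running it against real files. This is only a sketch on plain strings; the variable names $toMove and $toDelete are made up for illustration.

```powershell
# Sample file names from the question.
$names = 'a.txt', 'z.txt', '1.txt', '.txt'

# -match applies the regex to each name; the trailing .* is not needed,
# because -match succeeds as soon as the pattern matches anywhere.
$toMove   = $names | Where-Object { $_ -match '^[a-zA-Z]' }
$toDelete = $names | Where-Object { $_ -match '^[^a-zA-Z]' }

$toMove     # a.txt, z.txt
$toDelete   # 1.txt, .txt
```

When switching to the real pipeline, adding -WhatIf to Move-Item and Remove-Item shows what would happen without changing anything.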

Powershell Remove Special Character(s) from Filenames

I am looking for a way to remove several special characters from filenames via a powershell script.
My filenames look like this:
[Report]_first_day_of_month_01_(generated_by_powershell)_[repnbr1].txt
I have been puzzling over removing the [] and everything between them, the () and everything between those, and removing all the _'s as well, with the desired result being a filename that looks like this:
first day of month 01.txt
Thus far, I have tried the below solution to no avail. I have run each of these from the directory in which the files reside.
Get-ChildItem -Path .\ -Filter *.mkv | %{
$Name = $_.Name
$NewName = $Name -Replace "(\s*)\(.*\)",''
$NewName2 = $NewName -Replace "[\s*]\[.*\]",''
$NewName3 = $NewName2 -Replace "_",' '
Rename-Item -Path $_ -NewName $NewName3
}
Since it does not work even if I try and do one set at a time like this:
Get-ChildItem -Path .\ -Filter *.mkv | %{
$Name = $_.Name
$NewName = $Name -Replace "(\s*)\(.*\)",''
Rename-Item -Path $_ -NewName $NewName
}
I assume there is an inherent flaw in the way I am trying to accomplish this task. That being said, I would prefer to use the Rename-Item cmdlet rather than using a move-item solution.
gci *.txt | Rename-Item -NewName {$_.Name -replace '_*(\[.*?\]|\(.*?\))_*' -replace '_+', ' '}
The first -replace uses a regex that matches [text] or (text) blocks and replaces them with nothing. Brackets and parentheses have to be escaped in a regex to match them literally. The pattern also takes any underscores directly before or after a block, so that [Report]_ and _[repnbr1] are removed whole; otherwise a stray _ would be left at the start or end of the name and turn into a leading or trailing space. The second -replace then turns the remaining underscores into spaces. Using $_.Name makes sure only the file name, not the full path, is passed to -replace.
See the regex working here: Regex101
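The two replacements can be followed step by step on the sample file name from the question. This is a sketch on a plain string; $step1 and $step2 are illustrative names only.

```powershell
# Sample name from the question.
$name = '[Report]_first_day_of_month_01_(generated_by_powershell)_[repnbr1].txt'

# Step 1: drop every [..] or (..) block together with adjacent underscores.
# With no replacement argument, -replace substitutes an empty string.
$step1 = $name -replace '_*(\[.*?\]|\(.*?\))_*'

# Step 2: turn each remaining run of underscores into a single space.
$step2 = $step1 -replace '_+', ' '

$step1   # first_day_of_month_01.txt
$step2   # first day of month 01.txt
```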

Get regex working in powershell script

I'm trying to rename several files using a regex expression.
ck1823000-23.dat
ck1293834-67.dat
lo1230324-99.dat
pk1232131-34.dat
...
I want to remove -XX
So the result would be like this:
ck1823000.dat
ck1293834.dat
lo1230324.dat
pk1232131.dat
...
I came up with this regex:
(?:.*?)([-\\s].*?).dat
But I get this error:
Rename-Item : The input to the script block for parameter 'NewName'
failed. The regular expression pattern is not valid
When I run this command:
Get-ChildItem . -file -Filter "*.dat" | Rename-Item -newname { $_.name -replace "\(?:.*?)([-\\s].*?).dat\", ""}
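Use the regex below and replace the matched characters with an empty string. Note that PowerShell strings do not treat the backslash as an escape character, so a single backslash is enough to escape the dot; a doubled \\ would ask the regex engine for a literal backslash.
-[^.-]*(?=\.dat)
DEMO
Get-ChildItem . -file -Filter "*.dat" | Rename-Item -newname { $_.name -replace "-[^.-]*(?=\.dat)", ""}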
Another option is to use the BaseName property instead of Name. Since BaseName does not include the extension, add it back afterwards:
Get-ChildItem . -file -Filter "*.dat" |
Rename-Item -newname { ($_.basename -replace "-.*") + $_.extension }
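A quick way to check the lookahead version is to run it on the sample names as plain strings. This sketch also shows why the dot is escaped with a single backslash: PowerShell passes backslashes through to the regex engine unchanged, so \\. would mean "a literal backslash followed by any character". The variable names are illustrative.

```powershell
# Sample names from the question.
$names = 'ck1823000-23.dat', 'lo1230324-99.dat'

# Remove "-XX" only when it is directly followed by ".dat" (lookahead).
# -replace applied to an array processes each element.
$fixed = $names -replace '-[^.-]*(?=\.dat)', ''

# With a doubled backslash the pattern requires a literal "\" in the name,
# so nothing matches and the input comes back unchanged.
$unchanged = 'ck1823000-23.dat' -replace '-[^.-]*(?=\\.dat)', ''

$fixed       # ck1823000.dat, lo1230324.dat
$unchanged   # ck1823000-23.dat
```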

Powershell regex group replacing

I want to replace some text in every script file in folder, and I'm trying to use this PS code:
$pattern = '(FROM [a-zA-Z0-9_.]{1,100})(?<replacement_place>[a-zA-Z0-9_.]{1,7})'
Get-ChildItem -Path 'D:\Scripts' -Recurse -Include *.sql | ForEach-Object { (Get-Content $_.fullname) -replace $pattern, 'replace text' | Set-Content $_.fullname }
But I have no idea how to keep first part of expression, and just replace the second one. Any idea how can I do this? Thanks.
I'm not sure the regex you provided for table names is correct, but in any case you can refer to captured groups in the replacement string as $1, $2 and so on, for example: 'Doe, John' -ireplace '(\w+), (\w+)', '$2 $1'
Note that the replacement pattern either needs to be in single quotes ('') or have the $ signs of the group references escaped with a backtick ("`$2 `$1").
# A simpler pattern may work better: $pattern = '(FROM) (?<replacement_place>[a-zA-Z0-9_.]{1,7})'
$pattern = '(FROM [a-zA-Z0-9_.]{1,100})(?<replacement_place>[a-zA-Z0-9_.]{1,7})'
Get-ChildItem -Path 'D:\Scripts' -Recurse -Include *.sql | ForEach-Object {
    # The parentheses read the whole file first, so Set-Content can overwrite it.
    (Get-Content $_.FullName) |
        ForEach-Object { $_ -replace $pattern, '$1 replace text' } |
        Set-Content $_.FullName -Force
}
If you need to reference other PowerShell variables in your replacement expression (as you may), use a double-quoted string and escape the $ of each capture reference with a backtick:
{ $_ -replace $pattern, "`$1 replacement text with $somePoshVariable" } |
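To see how the first group is kept while the second part is replaced, the substitution can be tried on a single line of text. This is a sketch under assumptions: the SQL line, the simplified pattern and the $newTable variable are invented for illustration and are not from the question. The ${1} form is the delimited way of writing $1 in a .NET replacement string, which avoids any ambiguity when text follows the group number directly.

```powershell
# Hypothetical input line and a simplified pattern (assumptions for this demo).
$line    = 'SELECT * FROM dbo.Old'
$pattern = '(FROM )(?<replacement_place>[a-zA-Z0-9_.]{1,7})'

# Single quotes: ${1} is passed to the regex engine as a group reference.
$single = $line -replace $pattern, '${1}dbo.New'

# Double quotes: the backtick protects ${1}, while $newTable is expanded.
$newTable = 'dbo.New'   # hypothetical PowerShell variable
$double = $line -replace $pattern, "`${1}$newTable"

$single   # SELECT * FROM dbo.New
$double   # SELECT * FROM dbo.New
```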

Find multiple lines spanning text and replace using PowerShell

I am using a regular expression search to match up and replace some text. The text can span multiple lines (may or may not have line breaks).
Currently I have this:
$regex = "\<\?php eval.*?\>"
Get-ChildItem -exclude *.bak | Where-Object {$_.Attributes -ne "Directory"} |ForEach-Object {
$text = [string]::Join("`n", (Get-Content $_))
$text -replace $RegEx ,"REPLACED"}
Try this:
$regex = New-Object Text.RegularExpressions.Regex "\<\?php eval.*?\>", ('singleline', 'multiline')
Get-ChildItem -exclude *.bak |
Where-Object {!$_.PsIsContainer} |
ForEach-Object {
$text = (Get-Content $_.FullName) -join "`n"
$regex.Replace($text, "REPLACED")
}
A regular expression is explicitly created via New-Object so that options can be passed in.
Try changing your regex pattern to:
"(?s)\<\?php eval.*?\>"
to get singleline (dot matches any char including line terminators). Since you aren't using the ^ or $ metacharacters I don't think you need to specify multiline (^ & $ match embedded line terminators).
Update: It seems that -replace makes sure the regex is case-insensitive so the i option isn't needed.
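The effect of the (?s) option can be checked on a small multi-line string. This is a sketch: the sample text is invented for illustration, and the < and > are written unescaped here, which matches the same literal characters.

```powershell
# Hypothetical multi-line input; `n is a newline inside double quotes.
$text = "line one`n<?php eval(`n  'x'`n); ?>`nline two"

# Without (?s) the dot stops at the newline, so nothing matches.
$without = $text -replace '<\?php eval.*?>', 'REPLACED'

# With (?s) the dot also matches newlines, so the block is replaced.
$with = $text -replace '(?s)<\?php eval.*?>', 'REPLACED'

$without -eq $text   # True: input is unchanged
$with                # line one / REPLACED / line two (on three lines)
```

When reading from a file, Get-Content -Raw (PowerShell 3.0 and later) returns the content as one string, which avoids joining the lines by hand.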
Alternatively, use the (.|\n)+ expression to cross line boundaries, since by default . doesn't match newlines.