Get regex working in powershell script - regex

I'm trying to rename several files using a regular expression:
ck1823000-23.dat
ck1293834-67.dat
lo1230324-99.dat
pk1232131-34.dat
...
I want to remove the -XX suffix, so the result would look like this:
ck1823000.dat
ck1293834.dat
lo1230324.dat
pk1232131.dat
...
I came up with this regex:
(?:.*?)([-\\s].*?).dat
But I get this error:
Rename-Item : The input to the script block for parameter 'NewName'
failed. The regular expression pattern is not valid
When I run this command:
Get-ChildItem . -file -Filter "*.dat" | Rename-Item -newname { $_.name -replace "\(?:.*?)([-\\s].*?).dat\", ""}

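The pattern in your command is invalid because of the extra backslashes at each end: the leading \( turns the group opener into a literal parenthesis, which leaves the ) unmatched, and a pattern cannot end with a lone \.
Use the regex below and replace the matched characters with an empty string.
-[^.-]*(?=\.dat)
Get-ChildItem . -file -Filter "*.dat" | Rename-Item -newname { $_.name -replace "-[^.-]*(?=\.dat)", ""}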

Another option is to use the BaseName property instead of Name and append the extension again, since BaseName does not include it:
Get-ChildItem . -file -Filter "*.dat" |
Rename-Item -newname { ($_.basename -replace "-.*") + $_.Extension }
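Both approaches can be tried on plain strings before touching any files. This is only a minimal sketch: the $names array reuses the sample names from the question, the variable names are made up for illustration, and [System.IO.Path] stands in for the BaseName and Extension properties that a real file object would provide. Note the single backslash in \.dat.

```powershell
# Sample names from the question (plain strings, no files are touched).
$names = 'ck1823000-23.dat', 'ck1293834-67.dat', 'lo1230324-99.dat', 'pk1232131-34.dat'

# Option 1: remove the hyphen and everything after it, up to (not including) ".dat".
$viaLookahead = $names | ForEach-Object { $_ -replace '-[^.-]*(?=\.dat)', '' }

# Option 2: strip the suffix from the base name, then put the extension back.
$viaBaseName = $names | ForEach-Object {
    $base = [System.IO.Path]::GetFileNameWithoutExtension($_)
    $ext  = [System.IO.Path]::GetExtension($_)
    ($base -replace '-.*') + $ext
}

$viaLookahead   # ck1823000.dat, ck1293834.dat, lo1230324.dat, pk1232131.dat
$viaBaseName    # same four names
```

When running against real files, adding -WhatIf to Rename-Item previews the renames without performing them.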

Related

Regex to Exclude something and bulk rename files

I'm trying to rename all files of a directory, remove parts of files names excluding some parts.
for example:
Before --> After
file1: Something S01E01 Hello There Guys.srt --> S01E01.srt
file2: Something_else S03E22 Good.bye.srt --> S03E22.srt
etc.
I tried following code in powershell:
Get-ChildItem | rename-item -NewName {$_.name -replace "Something",""}
Get-ChildItem | rename-item -NewName {$_.name -replace "Good.bye",""}
Get-ChildItem | rename-item -NewName {$_.name -replace "Something_else",""}
Get-ChildItem | rename-item -NewName {$_.name -replace " Hello(.*?)\.srt",".srt"}
Get-ChildItem | rename-item -NewName {$_.name -replace " ",""}
Is there a regex that keeps only the "SxxExx.srt" part of the file name and removes everything else, instead of hardcoding each string to remove?
You'll want to use a pattern like this:
(S\d{2}E\d{2})
to match and capture the S01E01 part.
Get-ChildItem | Rename-Item -NewName {($_.BaseName -replace '^.*(S\d{2}E\d{2}).*$','$1') + $_.Extension}
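To see what the capture group does, the same replacement can be run on the two sample names as plain strings. This is a sketch under assumptions: $names and $newNames are illustrative variables, and [System.IO.Path] is used to split off the extension because there is no file object here.

```powershell
# Sample names from the question (plain strings, no files are touched).
$names = 'Something S01E01 Hello There Guys.srt',
         'Something_else S03E22 Good.bye.srt'

$newNames = foreach ($name in $names) {
    $base = [System.IO.Path]::GetFileNameWithoutExtension($name)
    $ext  = [System.IO.Path]::GetExtension($name)
    # The whole base name is matched; only the captured SxxExx token ($1) is kept.
    ($base -replace '^.*(S\d{2}E\d{2}).*$', '$1') + $ext
}

$newNames   # S01E01.srt, S03E22.srt
```

The replacement string is single-quoted so that PowerShell passes $1 to the regex engine instead of expanding it as a variable.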
Maybe,
\b[A-Z][0-9]+[A-Z][0-9]+\b\s*|\.srt
or,
\b[A-Z][0-9]+[A-Z][0-9]+\b\s*|\.srt[^\r\n]*
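or some similar expression might be somewhat close. Note that these match the parts to keep (the SxxExx token, any whitespace after it, and the extension), so replacing the matches with an empty string would remove exactly those parts; extract the matches instead.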

Renaming files with a plus sign in PowerShell

When I downloaded a bunch of files from Dropbox, every Swedish character ä became +ñ. I'd like to replace +ñ with ä.
My command is the following:
Get-ChildItem -Filter "*+ñ*" -Recurse | Rename-Item -NewName {$_.name -replace '"+ñ"','ä'}
But running this gives the following error message:
Rename-Item : The input to the script block for parameter 'NewName' failed. Invalid regular expression pattern: +ñ.
At line:1 char:60
+ Get-ChildItem -Filter "*+ñ*" -Recurse| Rename-Item -NewName <<<< {$_.name -replace $str1,"ä"}
+ CategoryInfo : InvalidArgument: (S+ñker.txt:PSObject) [Rename-Item], ParameterBindingException
+ FullyQualifiedErrorId : ScriptBlockArgumentInvocationFailed,Microsoft.PowerShell.Commands.RenameItemCommand
So I've boiled it down to the + character being the problem. How do I handle + and other characters that aren't handled automatically in PowerShell?
The -replace operator does a regex search. Since + is a quantifier, you have to escape it using a backslash. Also drop the inner double quotes, which would otherwise be matched literally:
Get-ChildItem -Filter "*+ñ*" -Recurse | Rename-Item -NewName {$_.name -replace '\+ñ','ä'}
You could also use the non-regex version, the String.Replace method:
Get-ChildItem -Filter "*+ñ*" -Recurse | Rename-Item -NewName {$_.name.Replace('+ñ','ä')}
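For the general case of "other characters" that have a special meaning in a regex, one option is to let .NET do the escaping with [regex]::Escape rather than adding backslashes by hand. A small sketch, assuming the file name S+ñker.txt taken from the error message above; $literal, $pattern and $fixed are illustrative names only.

```powershell
# The literal text to find; '+' is a regex quantifier and must be escaped for -replace.
$literal = '+ñ'

# [regex]::Escape adds the backslashes needed to match the text literally.
$pattern = [regex]::Escape($literal)

# Apply the escaped pattern to the file name from the error message.
$fixed = 'S+ñker.txt' -replace $pattern, 'ä'

$pattern   # \+ñ
$fixed     # Säker.txt
```

The same $pattern could then be used inside the Rename-Item script block in place of the hand-escaped string.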

Regex is not working in powershell code, returns nothing

I have a problem with my regex: it selects only one of four errors.
When I use this regex in my PowerShell code, it does not work and returns nothing.
My code is:
Get-ChildItem -Path '/Users/user/Documents/tmp' -filter '*.txt' | ForEach-Object {
$content = Get-Content $_.FullName
[regex]::Matches($content, "(ERROR\:[\S\s\n\r]*?\n)(C:)") | ForEach-Object {
$_.Groups[0].Value -replace '\r?\n'
}
}
My regex is:
https://regex101.com/r/kU9gR4/1
What is the problem in my regex and in my powershell code?

Powershell Remove Special Character(s) from Filenames

I am looking for a way to remove several special characters from filenames via a powershell script.
My filenames look like this:
[Report]_first_day_of_month_01_(generated_by_powershell)_[repnbr1].txt
I have been puzzling over removing the [] and everything between them, the () and everything between those, and removing all the _'s as well, with the desired result being a filename that looks like this:
first day of month 01.txt
Thus far, I have tried the below solution to no avail. I have run each of these from the directory in which the files reside.
Get-ChildItem -Path .\ -Filter *.mkv | %{
$Name = $_.Name
$NewName = $Name -Replace "(\s*)\(.*\)",''
$NewName2 = $NewName -Replace "[\s*]\[.*\]",''
$NewName3 = $NewName2 -Replace "_",' '
Rename-Item -Path $_ -NewName $NewName3
}
Since it does not work even if I try and do one set at a time like this:
Get-ChildItem -Path .\ -Filter *.mkv | %{
$Name = $_.Name
$NewName = $Name -Replace "(\s*)\(.*\)",''
Rename-Item -Path $_ -NewName $NewName
}
I assume there is an inherent flaw in the way I am trying to accomplish this task. That being said, I would prefer to use the Rename-Item cmdlet rather than using a move-item solution.
gci *.txt | Rename-Item -NewName {$_.Name -replace '_*(\[.*?\]|\(.*?\))_*' -replace '_+', ' '}
The first -replace uses a regex that matches [text] or (text) blocks and replaces them with nothing. Parentheses and brackets need escaping in a regex to match them literally. The pattern also consumes any underscores directly before or after each block, such as [Report]_ or _[repnbr1]; otherwise a _ would be left at the start or end of the name and become an annoying leading or trailing space. The second -replace then turns the remaining underscores into spaces.
See the regex working here: Regex101
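The two replacement steps can also be followed on the sample file name as a plain string. This is just a sketch; the intermediate variable $withoutBlocks is introduced here for illustration and is not part of the one-liner above.

```powershell
# Sample file name from the question (a plain string, no file is touched).
$name = '[Report]_first_day_of_month_01_(generated_by_powershell)_[repnbr1].txt'

# Step 1: remove [..] and (..) blocks together with the underscores around them.
$withoutBlocks = $name -replace '_*(\[.*?\]|\(.*?\))_*'

# Step 2: turn each remaining run of underscores into a single space.
$newName = $withoutBlocks -replace '_+', ' '

$withoutBlocks   # first_day_of_month_01.txt
$newName         # first day of month 01.txt
```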

Replace all but last instance of a character

I am writing a quick PowerShell script to replace all periods except the last instance.
EG:
hello. this. is a file.name.doc → hello this is a filename.doc
So far from another post I was able to get this regexp, but it does not work with PowerShell:
\.(?=[^.]*\.)
As per https://www.regex101.com/, it only matches the first occurrence of a period.
EDIT:
Basically I need to apply this match and replace to a directory with sub directories. So far I have this:
Get-ChildItem -Filter "*.*" | ForEach {
$_.BaseName.Replace('.','') + $_.Extension
}
But it does not actually replace the items, and I do not think it is recursive.
I tried a few variations:
Get-Item -Filter "*.*" -Recurse |
Rename-Item -NewName {$_.BaseName.Replace(".","")}
but I get the error message
source and destination path must be different
I had the PowerShell side of things working but was stuck on the RegEx part. I was able to match either all the "." or only the last "." which was part of the file extension. Then I found this post with the missing link: \.(?=[^.]*\.)
I added that to the rest of the PowerShell command and it worked perfectly.
Get-ChildItem -Recurse | Rename-Item -NewName {$_.Name -replace '\.(?=[^.]*\.)','' }
Exclude files that don't have a period in their basename from being renamed:
Get-ChildItem -File -Recurse | Where-Object { $_.BaseName -like '*.*' } |
Rename-Item -NewName {$_.BaseName.Replace('.', '') + $_.Extension}
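The lookahead can be checked on plain strings first. A minimal sketch, assuming the example name from the question plus a made-up name (notes.txt) that has only one period; the variable names are illustrative.

```powershell
# A period is matched only if another period follows somewhere later in the name,
# so the last one (the extension separator) is never matched.
$pattern = '\.(?=[^.]*\.)'

$multi  = 'hello. this. is a file.name.doc' -replace $pattern, ''
$single = 'notes.txt' -replace $pattern, ''   # hypothetical name with a single period

$multi    # hello this is a filename.doc
$single   # notes.txt (unchanged)
```

Names with a single period come back unchanged, so filtering such items out before Rename-Item, as in the last command above, avoids attempting a rename to the same name.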