Batch rename files with regex

I have a number of files with the following format:
name_name<number><number> [TIF <11 numbers>].jpg
e.g. john_sam01 [TIF 15355474840].jpg
I would like to remove the [TIF 15355474840] part from all of these files.
This includes the leading space before the '[TIF...'; the 11 digits are different each time.
So the previous example would become:
john_sam01.jpg
In short, using PowerShell (or cmd.exe) with regex, I would like to turn this filename:
john_sam01 [TIF 15355474840].jpg
into this:
john_sam01.jpg
The variable parts are 'john', 'sam', the two digits, and the digits after TIF.

Something like, with added newlines for clarity:
dir ‹parameters to select the set of files› |
% {
$newName = $_.Name -replace '\s\[TIF \d+\]',''
rename-item -newname $newName -literalPath $_.Fullname
}
I would almost certainly add -WhatIf to the Rename-Item call until I was sure that both the file selection and the new names were correct.
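Before touching any files, the pattern can be checked against plain strings. This is a minimal sketch; the second sample name is made up for illustration and is not from the question.

```powershell
# Sample names; the second is a hypothetical example in the same format.
$samples = 'john_sam01 [TIF 15355474840].jpg', 'amy_lee12 [TIF 00000000001].jpg'

# \s matches the leading space, \[ and \] match the literal brackets,
# and \d+ matches the run of digits after "TIF ".
$renamed = $samples | ForEach-Object { $_ -replace '\s\[TIF \d+\]', '' }

$renamed
```

If the output looks right, the same -replace expression can go into the Rename-Item call, with -WhatIf for a dry run.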

Related

Add leading zero to file names using PowerShell

I have a folder that contains several hundred .mp3 files, all with the same name and an ascending number. The filenames look like this:
Test 01.mp3
Test 02.mp3
Test 03.mp3
Test 100.mp3
Test 101.mp3
Test 102.mp3
As you can see, the number of leading zeros in the first files is wrong, as they should have one more. I'd like to use PowerShell to solve the problem as I am currently learning to operate this quite helpful tool.
I tried to count the digits in the file names using the Replace Operator to filter out any non-digit characters. I assumed that the first 99 files would have three digits while the other files would have more (counting the '3' of the .mp3 file extension)
Get-Childitem | Where {($_.Name.Replace("\D","")).Length -le 3}
That should give me any files that have 3 or fewer digits in their file name, but it doesn't. In fact, it shows none. If I increase the number at the end to 11, I get the first three test files; increasing it to 12 shows all six of them. I assume that the Replace operator doesn't get applied to the file name before the filtering based on Length, although I used parentheses around $_.Name.Replace("\D","")
What the hell am I doing wrong?
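I figured it out: Get-ChildItem | Where {($_.Name -replace "\D","").Length -le 3} returns the files that I need to rename. The .Replace() method replaces literal text, so it was looking for the two characters \D, whereas the -replace operator treats the pattern as a regex.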
The whole command I used was
Get-ChildItem | Where {($_.Name -replace "\D","").Length -le 3} | Rename-Item -NewName { $_.Name -replace "Test ","Test 0"}
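The difference between the method and the operator can be seen on a single string. This is a small sketch using one of the file names from the question:

```powershell
$name = 'Test 01.mp3'

# String.Replace does literal replacement: it looks for the two characters \D,
# finds none, and returns the name unchanged.
$viaMethod = $name.Replace('\D', '')

# -replace treats '\D' as a regex (any non-digit), so only the digits remain.
$viaOperator = $name -replace '\D', ''

'{0} -> length {1}' -f $viaMethod, $viaMethod.Length
'{0} -> length {1}' -f $viaOperator, $viaOperator.Length
```

The unchanged name is 11 characters long, which is consistent with the files only showing up once the threshold was raised to 11.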
It's also possible to rename all files to a number-only scheme and use the PadLeft method.
By replacing "Test " with "Test 0", you would still not achieve what you want on files that are numbered Test 1.mp3 (as this will become Test 01.mp3, which is one leading zero short).
You can make sure all files will have a 3-digit sequence number by doing this:
Get-Childitem -Path 'D:\Test' -Filter '*.mp3' -File |
Where-Object {$_.BaseName -match '(\D+)(\d+)$'} |
Rename-Item -NewName { '{0}{1:D3}{2}' -f $Matches[1], [int]$Matches[2], $_.Extension }
With this, wrongly named files like Test 00000003.mp3 will also be renamed, to Test 003.mp3.
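The padding logic can be tried on strings alone before renaming anything. The sketch below assumes a few made-up base names; it uses the same regex and D3 format specifier as the command above, and shows String.PadLeft for comparison.

```powershell
# Hypothetical base names (without extension) covering several digit counts.
$baseNames = 'Test 1', 'Test 01', 'Test 100', 'Test 00000003'

$padded = foreach ($baseName in $baseNames) {
    if ($baseName -match '^(\D+)(\d+)$') {
        # Casting to [int] drops existing leading zeros; D3 pads back to three digits.
        '{0}{1:D3}' -f $Matches[1], [int]$Matches[2]
    }
}

# The same kind of padding via String.PadLeft, which only ever adds characters.
$viaPadLeft = '7'.PadLeft(3, '0')

$padded
$viaPadLeft
```

Unlike the format operator with an [int] cast, PadLeft never shortens a string, so it would leave 00000003 as it is.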

Comparing two text files is not working in Powershell

I am trying to compare the contents of two text files and have only the differences be outputted to the console.
The first text file is based on the file names in a folder.
$AsyFolder = Get-ChildItem -Path .\asy-data -Name
I then remove the prefix of the file name that is set up by the user and is the same for every file and is separated from the relevant info with a dash.
$AsyFolder| ForEach-Object{$_.Split("-").Replace("$Prefix", "")} | Where-Object {$_}|Set-Content -Path .\templog.txt
The output ($AsyFolder output) looks like this:
bpm.art
gbr.pdf
asy.pdf
fab.pdf
as1.art
odb.tgz
ccam.cad
read_me_asy.txt
There is another file that is the reference and contains the suffixes of files that should be there.
It looks like this (reference file):
tpm.art
bpm.art
gbr.pdf
asy.pdf
fab.pdf
as1.art
as2.art
odb.tgz
xyp.txt
ccam.cad
And its contents are received with $AsyTemplate = Get-Content -Path C:\Users\asy_files.txt
The logic is as follows
$AsyTemplate |
ForEach-Object{
If(Select-String -Path .\templog.txt -Pattern $_ -NotMatch -Quiet){
Write-Host "$($_)"
}
}
I have tried various ways of setting up the templog.txt with -InputObject: using Get-Content, Get-Content -Raw, a variable, writing an array manually. I have also tried removing -NotMatch and using -eq $False for the output of select string.
Every time, though, the output is just the contents of asy_files.txt (the reference file). It doesn't seem to care what is in templog.txt (the $AsyFolder output).
I have tried the Compare-Object/Where-Object method as well, and it just says that both files are completely different.
Thank you @Lee_Dailey for your help in figuring out how to properly ask a question...
It ended up being additional whitespace (3 tabs) after the characters in the reference file asy_files.txt.
It was an artifact from where I copied from, and PowerShell was seeing "as2.art" and "as2.art " as different strings. I am not 100% sure why that matters, but I found that searching for any whitespace (\s) that appears after a word character (\w) and removing it made the comparison logic work. The Compare-Object | Where-Object approach worked as well after removing the whitespace.
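String comparison is exact, so a name followed by tabs is simply a different string from the bare name. The sketch below is one way to normalise the reference list before comparing. Both lists are hypothetical stand-ins for the two files, and TrimEnd is used here in place of the regex described above.

```powershell
# Hypothetical reference entries; two carry the stray trailing tabs described above.
$reference = "tpm.art`t`t`t", 'bpm.art', "as2.art`t`t`t", 'gbr.pdf'
# Hypothetical list of suffixes actually found in the folder.
$found = 'bpm.art', 'gbr.pdf'

# Strip trailing whitespace so "as2.art<tab><tab><tab>" and "as2.art" compare equal.
$clean = $reference | ForEach-Object { $_.TrimEnd() }

# Keep only the reference entries that are absent from the folder listing.
$missing = $clean | Where-Object { $found -notcontains $_ }

$missing
```

Comparing whole strings with -notcontains also avoids treating the dot in a name such as as2.art as a regex wildcard, which a pattern-based search would do.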

PowerShell matching strings with Regex

I'm working on a script to move TV shows into their corresponding folders on my drive. I'm having issues matching shows to their folders. This is the snippet of code I'm having a problem with:
#Remove all non-alphanumeric characters from the name
$newname = $Episode.Name -replace '[^0-9a-zA-Z ]', ' '
#Split the name at S01E01 and store the showname in a variable (Text before S01E01)
$ShowName = [regex]::Split($newname, 'S*(\d{1,2})(x|E)')[0]
#Match and get the destination folder where the names are similar
################## THIS IS WHERE THE ISSUE IS #######################
$DestDir = Gci -Path $DestinationRoot | Where { $ShowName -like "*$($_.Name)*" } | foreach {$_.Name }
For example, a show named "Doctor Who 2005 S02E02 Tooth and Claw.mp4" is not returning a similar folder, which is named "DoctorWho".
Question(s):
How can I modify the $DestDir line so that I can match the names? Is there a better way of doing this?
Working Code:
# Extract the name of the show (text before SxxExx)
$ShowName = [regex]::Split($Episode.Basename, '.(\d{1,3})(X|x|E|e)(\d{1,3})')[0]
# Assumption: There is a folder in TV shows directory that is named correctly, and the input file is named correctly
# Try to match by stripping all non-Alphabet characters from both names and check if the folder name contains the file name
$Folder = gci -Path $DestinationRoot |
Where {$_.PSisContainer -and `
(($_.Name -replace '[^A-Za-z]','') -match ($ShowName -replace '[^A-Za-z]','')) } |
select -ExpandProperty fullname
Some sample output from testing:
Input file name: Arrow S01E02.mp4
Show name: Arrow
Matching folder: C:\Users\Public\Videos\TV Shows\Arrow
-----------------------------------------------------------------------
Input file name: Big Bang Theory S3E03.avi
Show name: Big Bang Theory
Matching folder: C:\Users\Public\Videos\TV Shows\The Big Bang Theory
-----------------------------------------------------------------------
Input file name: Doctor Who S08E03.mp4
Show name: Doctor Who
Matching folder: C:\Users\Public\Videos\TV Shows\Doctor Who (2005)
-----------------------------------------------------------------------
Input file name: GameOfThronesS01E01.mp4
Show name: GameOfThrones
Matching folder: C:\Users\Public\Videos\TV Shows\Game Of Thrones
-----------------------------------------------------------------------
Based on your suggestion, I used the same method as you to figure out the show name, with Doctor Who 2005 S02E02 Tooth and Claw.mp4 as the input:
$showName = $Episode -replace '[^0-9a-zA-Z ]'
$showName = ($showName -split ('S*(\d{1,2})(x|E)'))[0]
$showName = $showName -replace "\d"
I added the line $showName = $showName -replace "\d" to account for the year in the name. The caveat is that this breaks if the show name itself contains a number, but it should work for most shows. Continuing to the $DestDir determination: part of the issue is that your Where comparison is backwards. You want to see if the show name is part of the potential folder name, not the other way around. Also, since the potential folder name could contain spaces, the comparison should account for that.
Get-ChildItem -Path $DestinationRoot -Directory | Where-Object { ($_.Name -replace " ") -like "*$($showName -replace ' ')*" }
I would go on to use a Choice selection to have the user confirm the folder since it is possible to have multiple matches. I would like to point out that it might be hard to account for all naming conventions and variances but what you have is a good start.
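The matching idea can be tried without a file system by working on strings. In this sketch the folder list is hypothetical, and both names are reduced to letters only before comparing, as in the working code earlier in this thread.

```powershell
# Hypothetical inputs: a file base name and the folder names under the destination root.
$fileBase = 'Doctor Who 2005 S02E02 Tooth and Claw'
$folderNames = 'DoctorWho', 'Arrow', 'The Big Bang Theory'

# Text before the season/episode marker, here "Doctor Who 2005 ".
$showName = [regex]::Split($fileBase, 'S*(\d{1,2})(x|E)')[0]

# Reduce the show name to letters only, so spaces, years and punctuation are ignored.
$showKey = $showName -replace '[^A-Za-z]', ''

# Apply the same reduction to each folder name and look for the show key inside it.
$matchingFolders = $folderNames | Where-Object {
    ($_ -replace '[^A-Za-z]', '') -like "*$showKey*"
}

$matchingFolders
```

Because more than one folder can match, the result is best treated as a list of candidates to confirm rather than a single answer.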

How to Find Replace Multiple strings in multiple text files using Powershell

I am new to scripting, and Powershell. I have been doing some study lately and trying to build a script to find/replace text in a bunch of text files (Each text file having code, not more than 4000 lines). However, I would like to keep the FindString and ReplaceString as variables, for there are multiple values, which can in turn be read from a separate csv file.
I have come up with this code, which is functional, but I would like to know if this is the optimal solution for the aforementioned requirement. I would like to keep the FindString and ReplaceString as regular expression compatible in the script, as I would also like to Find/Replace patterns. (I am yet to test it with Regular Expression Pattern)
Sample contents of Input.csv: (Number of objects in csv may vary from 50 to 500)
FindString ReplaceString
AA1A 171PIT9931A
BB1B 171PIT9931B
CC1C 171PIT9931E
DD1D 171PIT9932A
EE1E 171PIT9932B
FF1F 171PIT9932E
GG1G 171PIT9933A
The Code
$Iteration = 0
$FDPATH = 'D:\opt\HMI\Gfilefind_rep'
#& 'D:\usr\fox\wp\bin\tools\fdf_g.exe' $FDPATH\*.fdf
$GraphicsList = Get-ChildItem -Path $FDPATH\*.g | ForEach-Object FullName
$FindReplaceList = Import-Csv -Path $FDPATH\Input.csv
foreach($Graphic in $Graphicslist){
Write-Host "Processing Find Replace on : $Graphic"
foreach($item in $FindReplaceList){
Get-Content $Graphic | ForEach-Object { $_ -replace "$($item.FindString)", "$($item.ReplaceString)" } | Set-Content ($Graphic+".tmp")
Remove-Item $Graphic
Rename-Item ($Graphic+".tmp") $Graphic
$Iteration = $Iteration +1
Write-Host "String Replace Completed for $($item.ReplaceString)"
}
}
I have gone through other posts here on Stack Overflow and gathered valuable input, on which this code is based. This post from Ivo Bosticky came pretty close to my requirement, but I had to perform the same operation in a nested foreach loop, with the find/replace strings as variables read from an external source.
To summarize:
I would like to know if the above code can be optimized for execution, since I feel it takes a long time to execute. (I prefer not using aliases for now, as I am just starting out, and am fine with a long and functional script rather than a concise one which is hard to understand.)
I would like to add the number of iterations being carried out in the loop. I was able to add the current iteration number onto the console, but couldn't figure out how to pipe the output of Measure-Command into a variable, which could be used in a Write-Host command. I would also like to display the time taken for code execution, on completion.
Thanks for the time taken to read this Query. Much appreciate your support!
First of all, unless your replacement string is going to contain newlines (which would change the line boundaries), I would advise getting and setting each $Graphic file's contents only once, and doing all replacements in a single pass. This will also result in fewer file renames and deletions.
Second, it would be (probably marginally) faster to pass $item.FindString and $item.ReplaceString directly to the -replace operator rather than invoking the templating engine to inject the values into string literals.
Third, unless you truly need the output to go directly to the console instead of going to the normal output stream, I would avoid Write-Host. See Write-Host Considered Harmful.
And fourth, you might actually want to remove the Write-Host that gets called for every find and replace, as it may have a fair bit of effect on the overall execution time, depending on how many replacements there are.
You'd end up with something like this:
$timeTaken = (measure-command {
$Iteration = 0
$FDPATH = 'D:\opt\HMI\Gfilefind_rep'
#& 'D:\usr\fox\wp\bin\tools\fdf_g.exe' $FDPATH\*.fdf
$GraphicsList = Get-ChildItem -Path $FDPATH\*.g | ForEach-Object FullName
$FindReplaceList = Import-Csv -Path $FDPATH\Input.csv
foreach($Graphic in $Graphicslist){
Write-Output "Processing Find Replace on : $Graphic"
Get-Content $Graphic | ForEach-Object {
foreach($item in $FindReplaceList){
$_ = $_ -replace $item.FindString, $item.ReplaceString
}
$Iteration += 1
$_
} | Set-Content ($Graphic+".tmp")
Remove-Item $Graphic
Rename-Item ($Graphic+".tmp") $Graphic
}
}).TotalMilliseconds
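I haven't tested it, but it should run a fair bit faster, and it saves the elapsed time (in milliseconds) to the $timeTaken variable. Note that Measure-Command discards the output of its script block, so the Write-Output messages will not appear on the console.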
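The single-pass idea can be sketched on in-memory lines, with no files involved. The two list entries reuse values from the sample Input.csv, while the text lines are made up for illustration.

```powershell
# Stand-in for Import-Csv output: objects with FindString and ReplaceString properties.
$findReplaceList = @(
    [pscustomobject]@{ FindString = 'AA1A'; ReplaceString = '171PIT9931A' }
    [pscustomobject]@{ FindString = 'BB1B'; ReplaceString = '171PIT9931B' }
)

# Hypothetical file content, one string per line.
$lines = 'valve AA1A open', 'pump BB1B and AA1A'

# Each line is read once and every replacement is applied to it before it is emitted.
$result = foreach ($line in $lines) {
    foreach ($item in $findReplaceList) {
        $line = $line -replace $item.FindString, $item.ReplaceString
    }
    $line
}

$result
```

Since -replace treats FindString as a regex, a search value that should be matched literally can be passed through [regex]::Escape first.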

What is the regex for replacing "123456-name" pattern?

I am trying to rename multiple files in windows using powershell.
And I want to rename replacing this pattern:
"123456-the_other_part_of_the_string".
Example:
409873-doc1.txt
378234-doc2.txt
1230-doc3.txt
Basically I want to crop the numbers + '-' thing.
$variable -replace "^\d+-", ""
Get-ChildItem . *.txt | Where {$_.Name -match '^\d+-(.*)'} |
Rename-Item -NewName {$matches[1]}
or with aliases:
gci . *.txt | ?{$_.Name -match '^\d+-(.*)'} | rni -new {$matches[1]}
[0-9]*?-[^.]*
I'd recommend you take some time to learn regex, though, instead of just using answers from SO. You will run into all sorts of unusual file naming that may throw your program off in the real world, and you won't be able to fix these issues without understanding regex.
EDIT: Not sure if you also want to remove the 'name' part. If not, use this instead:
[0-9]*?-
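The patterns suggested in these answers behave differently, which is easy to see by running them over the example names from the question. This is a small sketch using -replace on strings only:

```powershell
$names = '409873-doc1.txt', '378234-doc2.txt', '1230-doc3.txt'

# Anchored pattern: strips only the leading digits and the dash.
$anchored = $names | ForEach-Object { $_ -replace '^\d+-', '' }

# Unanchored pattern that also consumes everything up to the first dot.
$withoutName = $names | ForEach-Object { $_ -replace '[0-9]*?-[^.]*', '' }

$anchored
$withoutName
```

The anchored version keeps doc1.txt and so on, while the second pattern leaves only the extension, so it only fits if the name part should go too.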