Add leading zero to file names using PowerShell - regex

I have a folder that contains several hundred .mp3 files, all with the same name and an ascending number. The filenames look like this:
Test 01.mp3
Test 02.mp3
Test 03.mp3
Test 100.mp3
Test 101.mp3
Test 102.mp3
As you can see, the number of leading zeros in the first files is wrong, as they should have one more. I'd like to use PowerShell to solve the problem as I am currently learning to operate this quite helpful tool.
I tried to count the digits in the file names using the Replace Operator to filter out any non-digit characters. I assumed that the first 99 files would have three digits while the other files would have more (counting the '3' of the .mp3 file extension)
Get-Childitem | Where {($_.Name.Replace("\D","")).Length -le 3}
That should give me any files that have 3 or fewer digits in their file name, but it doesn't. In fact, it shows none. If I increase the number at the end to 11, I get the first three test files; increasing it to 12 shows all six of them. I assume that the Replace operator doesn't get applied to the file name before the filtering based on the Length property, although I used brackets around $_.Name.Replace("\D","")
What the hell am I doing wrong?

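I figured it out: Get-ChildItem | Where {($_.Name -replace "\D","").Length -le 3} returns the files that I need to rename. The .Replace() method performs a literal string replacement, so it searched for the two characters \D, whereas the -replace operator treats its first argument as a regular expression.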
The whole command I used was
Get-ChildItem | Where {($_.Name -replace "\D","").Length -le 3} | Rename-Item -NewName { $_.Name -replace "Test ","Test 0"}
It's also possible to rename all files to a number-only scheme and use the PadLeft method as shown here
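The difference between the two attempts can be seen on a single name. This is a minimal sketch: the sample name comes from the question, and the variable names are only illustrative.

```powershell
$name = 'Test 01.mp3'

# String.Replace() is a literal replacement: it searches for the two
# characters "\D", finds none, and returns the name unchanged.
$literal = $name.Replace('\D', '')

# The -replace operator treats its first argument as a regex, so \D
# removes every non-digit character.
$digitsOnly = $name -replace '\D', ''

$literal.Length   # 11, which is why "-le 11" matched the first three files
$digitsOnly       # 013

# PadLeft pads a numeric string to a fixed width.
$padded = '7'.PadLeft(3, '0')
$padded           # 007
```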

By replacing "Test " with "Test 0", you would still not achieve what you want on files that are numbered Test 1.mp3 (as this will become Test 01.mp3, which is one leading zero short).
You can make sure all files will have a 3-digit sequence number by doing this:
Get-Childitem -Path 'D:\Test' -Filter '*.mp3' -File |
Where-Object {$_.BaseName -match '(\D+)(\d+)$'} |
Rename-Item -NewName { '{0}{1:D3}{2}' -f $Matches[1], [int]$Matches[2], $_.Extension }
With this, wrongly named files like Test 00000003.mp3 will also be renamed to Test 003.mp3
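To see what the format string produces without touching any files, the same match and format can be run against plain strings. This is only a sketch: the sample names are made up for illustration, and the base name and extension are split with System.IO.Path instead of the FileInfo properties used above.

```powershell
$samples = 'Test 1.mp3', 'Test 01.mp3', 'Test 100.mp3', 'Test 00000003.mp3'

$renamed = foreach ($sample in $samples) {
    $base = [System.IO.Path]::GetFileNameWithoutExtension($sample)
    $ext  = [System.IO.Path]::GetExtension($sample)

    # Group 1 is the non-digit prefix, group 2 the trailing number.
    if ($base -match '(\D+)(\d+)$') {
        # D3 formats the integer with at least three digits.
        '{0}{1:D3}{2}' -f $Matches[1], [int]$Matches[2], $ext
    }
}

$renamed
```

Note that the first two samples end up with the same target name, so a folder that really contained both would presumably need those collisions resolved by hand.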

Related

How To Delete Multiple Files After Matching Through REGEX Using CMD/PowerShell In Windows?

I have a folder including sub-folders in my Windows PC where I have multiple files of images with different dimensions with standard formatted names as shown below.
first-image-name.jpg
first-image-name-72x72.jpg
first-image-name-150x150.jpg
first-image-name-250x250.jpg
first-image-name-300x300.jpg
first-image-name-400x400.jpg
first-image-name-1024x1024.jpg
second-image-name.png
second-image-name-72x72.png
second-image-name-150x150.png
second-image-name-250x250.png
second-image-name-300x300.png
second-image-name-400x400.png
second-image-name-1024x1024.png
Now I want to delete all the image files whose names carry a size suffix and keep only the originals.
For that, I tried several commands, shared below, but none of them work...
Windows PowerShell:
Get-ChildItem $Path | Where{$_.Name -Match '.*[0-9]+x[0-9]+.\(jpg\|png\|jpeg\)$'} | Remove-Item
Windows CMD:
find -type f -regex '.*[0-9]+x[0-9]+.\(jpg\|png\|jpeg\)$' -delete
find -name '.*[0-9]+x[0-9]+.\(jpg\|png\|jpeg\)$' -delete
None of the above works, so please let me know what I am doing wrong. Please remember that it has to work recursively, as I have many folders inside the main folder too.
The fourth bird has provided the crucial pointer in a comment:
You mistakenly \-escaped the following regex metacharacters in your regex, which causes them to be matched as literals: (, ), and |. Simply omitting the \ would work.
Conversely, you neglected to escape . as \., given that you want it to be interpreted literally.
However, I suggest the following optimization, which pre-filters the files of interest and then matches only against each pre-filtered file's base name (.BaseName), i.e. the name without its extension:
Get-ChildItem -Recurse $Path -Include *.jpg, *.jpeg, *.png |
Where-Object { $_.BaseName -match '-[0-9]+x[0-9]+$' } |
Remove-Item -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
Note:
The regex above needs no .* prefix, given that PowerShell's -match operator looks for substrings by default.
Using character class \d in lieu of [0-9] is an option, although \d technically also matches digits other than the ASCII-range characters 0 through 9, namely anything that the Unicode standard classifies as a digit.
While use of -Include, which (unlike -Filter) conveniently allows you to specify multiple PowerShell wildcard patterns, works as expected in combination with -Recurse, in the absence of -Recurse its behavior is counterintuitive:
See this answer for details.
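The points above can be checked against plain strings before deleting anything. This is a minimal sketch using a few names from the question; the variable names are illustrative only.

```powershell
$names = 'first-image-name.jpg',
         'first-image-name-72x72.jpg',
         'second-image-name.png',
         'second-image-name-1024x1024.png'

# Original pattern: \( \| \) are literals here, so it matches nothing.
$broken = '.*[0-9]+x[0-9]+.\(jpg\|png\|jpeg\)$'
# Corrected pattern: unescaped group and alternation, escaped dot.
$fixed  = '[0-9]+x[0-9]+\.(jpg|png|jpeg)$'

$matchedBroken = @($names | Where-Object { $_ -match $broken })
$matchedFixed  = @($names | Where-Object { $_ -match $fixed })

$matchedBroken.Count   # 0
$matchedFixed          # the two names with a size suffix

# \d also matches non-ASCII digits such as ARABIC-INDIC DIGIT THREE (U+0663).
$arabicThree = [string][char]0x0663
$arabicThree -match '\d'      # True
$arabicThree -match '[0-9]'   # False
```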

Comparing two text files is not working in Powershell

I am trying to compare the contents of two text files and have only the differences be outputted to the console.
The first text file is based on the file names in a folder.
$AsyFolder = Get-ChildItem -Path .\asy-data -Name
I then remove the prefix of the file name that is set up by the user and is the same for every file and is separated from the relevant info with a dash.
$AsyFolder| ForEach-Object{$_.Split("-").Replace("$Prefix", "")} | Where-Object {$_}|Set-Content -Path .\templog.txt
The output ($AsyFolder output) looks like this:
bpm.art
gbr.pdf
asy.pdf
fab.pdf
as1.art
odb.tgz
ccam.cad
read_me_asy.txt
There is another file that is the reference and contains the suffixes of files that should be there.
It looks like this (reference file):
tpm.art
bpm.art
gbr.pdf
asy.pdf
fab.pdf
as1.art
as2.art
odb.tgz
xyp.txt
ccam.cad
And its contents are received with $AsyTemplate = Get-Content -Path C:\Users\asy_files.txt
The logic is as follows
$AsyTemplate |
    ForEach-Object{
        If(Select-String -Path .\templog.txt -Pattern $_ -NotMatch -Quiet){
            Write-Host "$($_)"
        }
    }
I have tried various ways of setting up the templog.txt with -InputObject: using Get-Content, Get-Content -Raw, a variable, writing an array manually. I have also tried removing -NotMatch and using -eq $False for the output of select string.
Every time, though, the output is just the contents of asy_files.txt (Reference File). It doesn't seem to care what is in templog.txt ($AsyFolder Output).
I have tried using compare-object/where-object method as well and it just says that both files are completely different.
Thank you @Lee_Dailey for your help in figuring out how to properly ask a question...
It ended up being additional whitespace (3 tabs) after the characters in the reference file asy_files.txt.
It was an artifact of where I copied the list from, and PowerShell was seeing "as2.art" and "as2.art" followed by tabs as different strings. I am not 100% sure why that matters, but I found that searching for any whitespace (\s) that appears after a word character (\w) and removing it made the comparison logic work. The Compare-Object | Where-Object approach worked as well after removing the whitespace.
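A likely reason the whitespace matters is that the comparison is exact: a line with trailing tabs is a different string (and, in Select-String, a different pattern) from the same line without them. The sketch below uses in-memory arrays in place of the two files; the sample values are taken from the question, and the trailing tabs are an assumption matching the description above.

```powershell
# Reference entries as copied, each with three trailing tabs.
$reference = "tpm.art`t`t`t", "bpm.art`t`t`t", "as2.art`t`t`t"
# Suffixes actually found in the folder.
$found = @('bpm.art', 'gbr.pdf')

# Without trimming, no reference entry equals a found entry.
$missingRaw = @($reference | Where-Object { $found -notcontains $_ })

# Strip trailing whitespace first, then compare.
$clean   = $reference | ForEach-Object { $_ -replace '\s+$', '' }
$missing = @($clean | Where-Object { $found -notcontains $_ })

$missingRaw.Count   # 3: everything looks missing
$missing            # tpm.art, as2.art
```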

How to convert a string containing 2 numbers to currency with powershell?

I have text files that contain 2 numbers separated by a '+' sign. I am trying to figure out how to replace them with their currency equivalent.
Example Strings:
20+2 would be converted to $0.20+$0.02 USD
1379+121 would be $13.79+$1.21 USD
400+20 would be $4.00+$0.20 USD
and so on.
I have tried using a few angles but they do not work or provide odd results.
I tried to do it here by attempting to find by all patterns I think would come up .
.\Replace-FileString.ps1 "100+10" '$1.00+$0.10' $path1\*.txt -Overwrite
.\Replace-FileString.ps1 "1000+100" '$10.00+$1.00' $path1\*.txt -Overwrite
.\Replace-FileString.ps1 "300+30" '$3.00+$0.30' $path1\*.txt -Overwrite
.\Replace-FileString.ps1 "400+20" '$4.00+$0.20' $path1\*.txt -Overwrite
or this which just doesn't work.
Select-String -Path .\*txt -Pattern '[0-9][0-9]?[0-9]?[0-9]?[0-9]?\+[0-9][0-9]?[0-9]?[0-9]?[0-9]?' | ForEach-Object {$_ -replace ", ", $"} {$_ -replace "+", "+$"}
I tried to do it here by attempting to find by all patterns I think would come up
Don't try this - we're humans, and we won't think of all edge cases and even if we did, the amount of code we needed to write (or generate) would be ridiculous.
We need a more general solution here, and regex might indeed be helpful with this.
The pattern you describe could be expressed as three distinct parts:
1 or more consecutive digits
1 plus sign (+)
1 or more consecutive digits
With this in mind, let's start by simplifying the regex pattern to use:
\b\d+\+\d+\b
or, written out with explanations:
\b # a word boundary
\d+ # 1 or more digits
\+ # 1 literal plus sign
\d+ # 1 or more digits
\b # a word boundary
Now, in order to transform an absolute value of cents into dollars, we'll need to capture the digits on either side of the +, so let's add capture groups:
\b(\d+)\+(\d+)\b
Now, in order to do anything interesting with the captured groups, we can utilize the Regex.Replace() method - it can take a scriptblock as its substitution argument:
$InputString = '1000+10'
$RegexPattern = '\b(\d+)\+(\d+)\b'
$Substitution = {
    param($Match)
    $Results = foreach($Amount in $Match.Groups[1,2].Value){
        $Dollars = [Math]::Floor(($Amount / 100))
        $Cents = $Amount % 100
        '${0:0}.{1:00}' -f $Dollars,$Cents
    }
    return $Results -join '+'
}
In the scriptblock above, we expect the two capture groups ($Match.Groups[1,2]), calculate the amount of dollars and cents, and then finally use the -f string format operator to make sure that the cents value is always two digits wide.
To do the substitution, invoke the Replace() method:
[regex]::Replace($InputString,$RegexPattern,$Substitution)
And there you go!
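As a quick check against the sample strings from the question, the same idea can be wrapped in a small function. This is a sketch, not a replacement for the code above: the function name Convert-CentsPair is made up, and the explicit [int] and [double] casts are an addition for clarity.

```powershell
function Convert-CentsPair {
    param([string]$Text)

    # Called once per match; returns the replacement text for that match.
    $evaluator = [System.Text.RegularExpressions.MatchEvaluator]{
        param($Match)
        $parts = foreach ($i in 1, 2) {
            $amount  = [int]$Match.Groups[$i].Value
            $dollars = [Math]::Floor([double]$amount / 100)
            $cents   = $amount % 100
            # Cents are always formatted two digits wide.
            '${0}.{1:00}' -f $dollars, $cents
        }
        $parts -join '+'
    }

    [regex]::Replace($Text, '\b(\d+)\+(\d+)\b', $evaluator)
}

Convert-CentsPair '20+2'       # $0.20+$0.02
Convert-CentsPair '1379+121'   # $13.79+$1.21
Convert-CentsPair '400+20'     # $4.00+$0.20
```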
Applying this to a bunch of files is as easy as:
$RegexPattern = '\b(\d+)\+(\d+)\b'
$Substitution = {
    param($Match)
    $Results = foreach($Amount in $Match.Groups[1,2].Value){
        $Dollars = [Math]::Floor(($Amount / 100))
        $Cents = $Amount % 100
        '${0:0}.{1:00}' -f $Dollars,$Cents
    }
    return $Results -join '+'
}
foreach($file in Get-ChildItem $path *.txt){
    $Lines = Get-Content $file.FullName
    $Lines |ForEach-Object {
        [regex]::Replace($_, $RegexPattern, $Substitution)
    } |Set-Content $file.FullName
}
This regular expression works too:
\b\d{3,4}(?=\+)|\d{2,3}(?=\")
https://regex101.com/
Do you want something like this output?
$20+$2 would be converted to $0.20+$0.02 USD
$1379+$121 would be $13.79+$1.21 USD
$400+$20 would be $4.00+$0.20 USD
Then, you may try this command in powershell.
(gc test.txt) -replace '\b(\d+)\+(\d+)\b','$$$1+$$$2' | sc test.txt
gc , sc : alias for get-content, set-content commands respectively
\b(\d+)\+(\d+)\b : match the target string (numbers+numbers) and capturing numbers to $1, $2 in order
$$ : $ must be escaped as $$ to produce a literal dollar character (what you want to place in front of the numbers)
$1, $2 : back-reference to the captured value
test.txt : contains your sample text
Of course, this is applicable for multiple files like follows
gci '*.txt' -recurse | foreach-object{(gc $_.FullName) -replace '\b(\d+)\+(\d+)\b','$$$1+$$$2' | sc $_.FullName }
gci : alias for the Get-ChildItem command. By default, it lists the current directory. If you want a different directory, use the -Path option together with the -Include option.
-recurse option : enables to search sub-directory
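The replacement tokens described above can be tried on a single string first. This is a minimal sketch; the sample input is from the question and the variable names are illustrative. The replacement string is single-quoted so that PowerShell does not try to expand the $ characters before the regex engine sees them.

```powershell
$pattern     = '\b(\d+)\+(\d+)\b'
# $$ is a literal dollar sign; $1 and $2 are the captured numbers.
$replacement = '$$$1+$$$2'

$result = '20+2' -replace $pattern, $replacement
$result   # $20+$2
```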
Edited
If you want capturing & dividing values & replacing old value with new one like follows
$0.2+$0.02 would be converted to $0.20+$0.02 USD
$13.79+$1.21 would be $13.79+$1.21 USD
$4+$0.2 would be $4.00+$0.20 USD
then, you may try this.
gci *.txt -recurse | % {(gc $_) | % { $_ -match "\b(\d+)\+(\d+)\b" > $null; $num1=[int]$matches[1]/100; $num2=[int]$matches[2]/100; $dol='$$'; $_ -replace "\b(\d+)\+(\d+)\b","$dol$num1+$dol$num2"}|sc $_}
This command searches files in the present directory and its sub-directories. If you don't want to search sub-directories, remove the -recurse option. And if you want another path, use the -path option and the -include option as follows.
gci -path "your_path" -include *.txt | % {(gc $_) ...
Other solutions seem excessively complicated, first turning the string to values and then back to strings. Looking at the examples, it is just chopping up a string and re-assembling it while ensuring that the different parts (dollars and cents) have the correct lengths:
('20+2','1379+121','400+20') -replace
'(\d+)\+(\d+)','00$1+00$2' -replace
'0*(\d+)(\d\d)\+0*(\d+)(\d\d)','$$$1.$2+$$$3.$4 USD'
$0.20+$0.02 USD
$13.79+$1.21 USD
$4.00+$0.20 USD
Explanation:
Substitute all the + separated cent values with 0 padded values so there is a minimum of three digits, i.e. at least one digit in the dollars and exactly 2 for the cents.
Collect the individual dollars and cents for each value into distinct capture groups while simultaneously discarding any extraneous leading zeroes.
Re-substitute the (just padded) strings with the appropriately formatted versions.
It is interesting to note how the second substitution relies on the greedy nature of *. The 0* will match just as many leading zeroes as will still leave enough for the remainder of the pattern.
You can put in the word boundary anchor (\b), at one or both ends of the patterns, if you have parts of a line where there are digits separated by + which are directly adjacent to other text and you want them to be NOT processed, otherwise it is unnecessary.
Note: the example above shows an array of String as input and producing an array of String (each element displayed on a separate line). When -Replace is applied to an array, it enumerates the array, applies the replace to each element and collects each (possibly replaced) element into a result array. The output of Get-Content is an array of String (enumerated by PowerShell when supplying a pipeline). Similarly, the 'input' to Set-Content is an array of String (possibly converted from a general Object[] and/or collected from pipeline input). Thus, to convert a file just use:
(gc somefile) -replace ... -replace ... | sc newfile
# or even
sc newfile ((gc somefile) -replace ... -replace ...)
# Set-Content [-Path] String[] [-Value] Object[]
In the above, newfile and somefile can be the same due to a nice feature of Set-Content whereby it does not even open/create its output file(s) until it has something to write. Thus,
@() | sc existingfile
does not destroy existingfile. Note, however, that
sc existingfile @()
does destroy existingfile. This is because the first example sends nothing to Set-Content while the second example gives Set-Content something (an empty array). Since the output from Get-Content is collected into an (anonymous) array before -Replace is applied, there is no conflict between Get-Content and Set-Content over accessing the same file. The functionally equivalent version
gc somefile | foreach { $_ -replace ... -replace ... } | sc newfile
does not work if newfile is somefile since Set-Content receives each (possibly substituted) line from Get-Content before the next one is read meaning Set-Content can't open the file because Get-Content still has it open.
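The read-everything-first form can be tried safely on a throwaway file. This is a sketch under a few assumptions: the file is created in the temp directory under a random, made-up name, and the replacement text is arbitrary.

```powershell
# Create a scratch file with two sample lines.
$file = Join-Path ([System.IO.Path]::GetTempPath()) ('demo-{0}.txt' -f [guid]::NewGuid())
Set-Content -Path $file -Value '20+2', '400+20'

# The parentheses force Get-Content to finish (and close the file)
# before -replace runs and Set-Content opens the same file for writing.
(Get-Content -Path $file) -replace '(\d+)\+(\d+)', '$1 plus $2' |
    Set-Content -Path $file

$result = Get-Content -Path $file
Remove-Item -Path $file

$result   # 20 plus 2, 400 plus 20
```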
This is a separate answer because it doesn't explain how to achieve the desired result (already did that) but explains why the listed attempts do not work (an educational motive).
If you're using Replace-FileString.ps1 from GitHub then not only are the examples not a general solution, it won't work as listed above because Replace-FileString.ps1 uses the Replace method of a [regex] object so "400+20" matches "40" then 1 or more "0" then "20". Similarly for other attempts. Note, no "+" is matched in the patterns so all fail (unless you have lines like "40020+125" which matches on the 40020). Just as well, the replacement includes the capture group specifier "$0" (as part of '$1.00+$0.10') and other specifiers. There are no capture groups specified in the pattern so all the group specifiers would be taken literally, except "$0" being the entire match (if found). Thus, "40020+125" would be replaced by substituting '$4.00+$0.20' giving "$4.00+40020.20" ($4='$4' and $0='40020'). Probably, no matches are found. Result -> files not changed. (Phew!)
As for the Select-String attempt, Select-String would probably have matched the required data since the pattern matched up to 5 digits on either side of a +. This would send the matching lines (and ignored the rest, if any) into the ForEach-Object as [Microsoft.PowerShell.Commands.MatchInfo] objects (not strings). (Aside: this is a common mistake by a lot of PowerShell, um, novices. They assume that what they see on the screen is the same as what is churning about inside PowerShell. This is far from the truth and probably leads to most of the confusion amongst new users. PowerShell processes entire objects and typically displays only a summary of the most useful bits.) Anyway, I am unsure what the ForEach-Object is trying to achieve, not least due to the apparent typo. There is at least one " missing in the first script block and possibly a comma also. The best I can interpret it is
{ $_ -replace ", ",", $" }
i.e. change every ", " into ", $". This assumes that the strings to be substituted are all preceded by ", ". Note: lone $ is not an error because it cannot be interpreted as a variable substitution (no following name or {) or capture reference (no following group specifier [0-9`+'_&]). The next script block is clearer, change every "+" into "+$". Unfortunately, again, the first string is interpreted as a regular expression and, unlike the lone $, a lone + here is an error. It needs to be escaped with \. However, even with these errors corrected, there are two big problems:
The default output from Select-String is a collection of [MatchInfo] objects which when (implicitly) converted to String for use as the LHS of -replace include the file name and line number, thereby corrupting the lines from the file. To use just the line itself, specify $_.Line.
A completely incorrect usage of the scriptblock parameters to ForEach-Object. While it would seem that the intent was to perform two replace operations, placing them in individual scriptblocks is an error. Even if it worked, it would output 2 separate partial replacements instead of one completed replacement since $_ is not updated between the two expressions. ($_ is writable!)
ForEach-Object has 3 basic scriptblock groups, 1 -Begin block, 1 -End block and all the rest collectively as the -Process blocks. (The -Parallel block is not relevant here.) The documentation mentions a group called -RemainingScripts but this is actually just an implementation construct to allow the -Process scriptblocks to be specified as individual parameters rather than collected into an array (similar to parameter arrays in C# and VB). I suspect this was done so that users could simply drop the parameter names (-Begin, -Process and -End) and treat the scriptblocks as if they were positional parameters even though, strictly speaking, only -Process is positional and expects an array of scriptblocks (i.e. separated by commas). The introduction of -RemainingScripts in PS3.0 (with attribute ValueFromRemainingArguments so it behaves like a parameter array) was probably done to tidy up what might have been a nasty kludge to get the user friendly behaviour prior to PS3.0. Or maybe it was just formalising what was already going on.
Anyway, back on topic. By specifying multiple scriptblocks, the first is treated as -Begin and, if there are more than 2, the last is treated as -End. Thus, for two scriptblocks, the first is -Begin and the other is -Process. Therefore, even if the first scriptblock were syntactically correct, it would only run once and then still do nothing since $_ is not assigned (=$null) in -Begin. The correct way would be to place both replacements, joined into a single expression, in one scriptblock:
{ $_.Line -replace ", ",", $" -replace "\+","+$" }
Of course, this is just describing how to get it to "work". It is not the correct solution to the problem in the original post (see other answer).
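The way ForEach-Object assigns positional scriptblocks, as described above, is easy to observe with a short pipeline. This is a sketch with made-up output strings; the last lines chain both replacements in a single expression on a sample string rather than on a MatchInfo object.

```powershell
# Two scriptblocks: the first runs once as -Begin, the second per item as -Process.
$two = 1..3 | ForEach-Object { 'begin' } { "process $_" }
$two     # begin, process 1, process 2, process 3

# Three scriptblocks: -Begin, -Process, -End.
$three = 1..2 | ForEach-Object { 'begin' } { "process $_" } { 'end' }
$three   # begin, process 1, process 2, end

# Both replacements chained in one expression, with the + escaped.
$line = 'a, 20+2' -replace ', ', ', $' -replace '\+', '+$'
$line    # a, $20+$2
```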

Batch rename files with regex

I have a number of files with the following format:
name_name<number><number>[TIF<11 numbers>].jpg
e.g. john_sam01 [TIF 15355474840].jpg
And I would like to remove the [TIF 15355474840] from all of these files
This includes a leading space before the '[TIF...' and a different combination of 11 numbers each time.
So the previous example would become:
john_sam01.jpg
In short, using powershell (or cmd.exe) with regex I would like to turn this filename:
josh_sam01 [TIF 15355474840].jpg
Into this:
josh_sam01.jpg
With variables being: 'john' 'sam' two numbers and the numbers after TIF.
Something like, with added newlines for clarity:
dir ‹parameters to select the set of files› |
% {
$newName = $_.Name -replace '\s\[TIF \d+\]',''
rename-item -newname $newName -literalPath $_.Fullname
}
I would almost certainly add -WhatIf to the rename until I was sure I had the file selection and rename correct.
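The pattern itself can be checked on plain strings before renaming anything. This is a minimal sketch; the first sample name is from the question and the second is made up to show that names without the tag pass through unchanged.

```powershell
$names = 'john_sam01 [TIF 15355474840].jpg', 'mary_ann02.jpg'

# \s matches the leading space, \[ and \] the literal brackets, \d+ the digits.
$newNames = $names | ForEach-Object { $_ -replace '\s\[TIF \d+\]', '' }

$newNames   # john_sam01.jpg, mary_ann02.jpg
```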

How to Find Replace Multiple strings in multiple text files using Powershell

I am new to scripting, and Powershell. I have been doing some study lately and trying to build a script to find/replace text in a bunch of text files (Each text file having code, not more than 4000 lines). However, I would like to keep the FindString and ReplaceString as variables, for there are multiple values, which can in turn be read from a separate csv file.
I have come up with this code, which is functional, but I would like to know if this is the optimal solution for the aforementioned requirement. I would like to keep the FindString and ReplaceString as regular expression compatible in the script, as I would also like to Find/Replace patterns. (I am yet to test it with Regular Expression Pattern)
Sample contents of Input.csv: (Number of objects in csv may vary from 50 to 500)
FindString ReplaceString
AA1A 171PIT9931A
BB1B 171PIT9931B
CC1C 171PIT9931E
DD1D 171PIT9932A
EE1E 171PIT9932B
FF1F 171PIT9932E
GG1G 171PIT9933A
The Code
$Iteration = 0
$FDPATH = 'D:\opt\HMI\Gfilefind_rep'
#& 'D:\usr\fox\wp\bin\tools\fdf_g.exe' $FDPATH\*.fdf
$GraphicsList = Get-ChildItem -Path $FDPATH\*.g | ForEach-Object FullName
$FindReplaceList = Import-Csv -Path $FDPATH\Input.csv
foreach($Graphic in $Graphicslist){
    Write-Host "Processing Find Replace on : $Graphic"
    foreach($item in $FindReplaceList){
        Get-Content $Graphic | ForEach-Object { $_ -replace "$($item.FindString)", "$($item.ReplaceString)" } | Set-Content ($Graphic+".tmp")
        Remove-Item $Graphic
        Rename-Item ($Graphic+".tmp") $Graphic
        $Iteration = $Iteration +1
        Write-Host "String Replace Completed for $($item.ReplaceString)"
    }
}
I have gone through other posts here in Stackoverflow, and gathered valuable inputs, based on which the code was built. This post from Ivo Bosticky came pretty close to my requirement, but I had to perform the same on a nested foreach loop with Find/Replace Strings as Variables reading from an external source.
To summarize,
I would like to know if the above code can be optimized for execution, since I feel it takes a long time to execute. (I prefer not using aliases for now, as I am just starting out, and am fine with a long and functional script rather than a concise one which is hard to understand.)
I would like to add the number of iterations being carried out in the loop. I was able to add the current iteration number onto the console, but couldn't figure out how to pipe the output of Measure-Command into a variable, which could be used in a Write-Host command. I would also like to display the time taken for code execution, on completion.
Thanks for the time taken to read this Query. Much appreciate your support!
First of all, unless your replacement string is going to contain newlines (which would change the line boundaries), I would advise getting and setting each $Graphic file's contents only once, and doing all replacements in a single pass. This will also result in fewer file renames and deletions.
Second, it would be (probably marginally) faster to pass $item.FindString and $item.ReplaceString directly to the -replace operator rather than invoking the templating engine to inject the values into string literals.
Third, unless you truly need the output to go directly to the console instead of going to the normal output stream, I would avoid Write-Host. See Write-Host Considered Harmful.
And fourth, you might actually want to remove the Write-Host that gets called for every find and replace, as it may have a fair bit of effect on the overall execution time, depending on how many replacements there are.
You'd end up with something like this:
$timeTaken = (measure-command {
    $Iteration = 0
    $FDPATH = 'D:\opt\HMI\Gfilefind_rep'
    #& 'D:\usr\fox\wp\bin\tools\fdf_g.exe' $FDPATH\*.fdf
    $GraphicsList = Get-ChildItem -Path $FDPATH\*.g | ForEach-Object FullName
    $FindReplaceList = Import-Csv -Path $FDPATH\Input.csv
    foreach($Graphic in $Graphicslist){
        Write-Output "Processing Find Replace on : $Graphic"
        Get-Content $Graphic | ForEach-Object {
            foreach($item in $FindReplaceList){
                $_ = $_ -replace $item.FindString, $item.ReplaceString
            }
            $Iteration += 1
            $_
        } | Set-Content ($Graphic+".tmp")
        Remove-Item $Graphic
        Rename-Item ($Graphic+".tmp") $Graphic
    }
}).TotalMilliseconds
I haven't tested it but it should run a fair bit faster, plus it will save the elapsed time to a variable.
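The single-pass idea can be tried without any files. The sketch below uses in-memory stand-ins: the two find/replace pairs are taken from the sample Input.csv, while the sample lines and variable names are made up for illustration.

```powershell
# Stand-ins for the rows Import-Csv would return.
$pairs = @(
    [pscustomobject]@{ FindString = 'AA1A'; ReplaceString = '171PIT9931A' },
    [pscustomobject]@{ FindString = 'BB1B'; ReplaceString = '171PIT9931B' }
)
# Stand-ins for the lines Get-Content would return.
$lines = 'tag AA1A and BB1B', 'nothing here'

$out = New-Object 'System.Collections.Generic.List[string]'

# Measure-Command returns a TimeSpan for the scriptblock's run time.
$elapsed = Measure-Command {
    foreach ($line in $lines) {
        # Apply every pair to the line before moving on to the next line.
        foreach ($pair in $pairs) {
            $line = $line -replace $pair.FindString, $pair.ReplaceString
        }
        $out.Add($line)
    }
}

$out
'Processed {0} lines in {1:N0} ms' -f $out.Count, $elapsed.TotalMilliseconds
```

Because -replace treats FindString as a regular expression, a find string meant to be taken literally would need to go through [regex]::Escape() first.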