Fast filtering after Get-ChildItem - regex

With PowerShell 5.1 and the new .NET framework, it is finally possible to work with long path names using Get-ChildItem -Path "\\?\C:\MyPath".
In the code below we try to filter out files and folders that are in specific directories (as well as the directories themselves), because we're not interested in them.
This code works fine, but we were wondering if there is a better or faster way of doing this instead of building a long regex string?
In case the array $IgnoredFolders gets really long, it might take more time for -notmatch to make the decision. I'm not an expert in regex and how long they may become, so any feedback is welcome.
<#
# Path
Folder A
    File folder A.txt
Folder B
    File folder B.txt
Folder C
    File folder C.txt
File root A.txt
File root B.txt
#>
$Path = 'S:\testFolder'
$IgnoredFolders = @(
    "$Path\Folder B"
    "$Path\Folder C"
)
$RegexIgnoredFolders = $IgnoredFolders.ForEach({
    [Regex]::Escape("\\?\$_")
}) -join '|'
@(Get-ChildItem -LiteralPath "\\?\$Path" -Recurse).Where({
    $_.FullName -notmatch $RegexIgnoredFolders
}) | Select-Object FullName
The output of this code is:
"$Path\Folder A"
"$Path\File root A.txt"
"$Path\File root B.txt"
"$Path\Folder A\File folder A.txt"
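One possible refinement of the pattern above, offered as a sketch rather than a benchmarked answer: anchor each alternative so that "Folder B" does not also exclude a sibling such as "Folder BB", and build a compiled [regex] object once instead of passing a string to -notmatch for every item. The sample paths, including the "Folder BB" entry, are invented for illustration and stand in for the Get-ChildItem output.

```powershell
$Path = 'S:\testFolder'
$IgnoredFolders = @(
    "$Path\Folder B"
    "$Path\Folder C"
)

# Escape every folder, then anchor the alternation: it must start the string
# and be followed by either a backslash or the end of the string.
$escaped = $IgnoredFolders.ForEach({ [Regex]::Escape("\\?\$_") })
$pattern = '^(?:' + ($escaped -join '|') + ')(?:\\|$)'
$regex   = [regex]::new($pattern, 'IgnoreCase, Compiled')

# Hypothetical FullName values standing in for real Get-ChildItem results.
$samples = @(
    '\\?\S:\testFolder\Folder A'
    '\\?\S:\testFolder\Folder B'
    '\\?\S:\testFolder\Folder B\File folder B.txt'
    '\\?\S:\testFolder\Folder BB\Other.txt'
    '\\?\S:\testFolder\File root A.txt'
)

# Keep only the paths that are not inside an ignored folder.
$kept = $samples.Where({ -not $regex.IsMatch($_) })
$kept
```

Whether the compiled regex is measurably faster depends on the number of items and ignored folders, so it is worth timing both variants with Measure-Command on real data.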

Add leading zero to file names using PowerShell

I have a folder that contains several hundred .mp3 files, all with the same name and an ascending number. The filenames look like this:
Test 01.mp3
Test 02.mp3
Test 03.mp3
Test 100.mp3
Test 101.mp3
Test 102.mp3
As you can see, the number of leading zeros in the first files is wrong, as they should have one more. I'd like to use PowerShell to solve the problem as I am currently learning to operate this quite helpful tool.
I tried to count the digits in the file names using the Replace Operator to filter out any non-digit characters. I assumed that the first 99 files would have three digits while the other files would have more (counting the '3' of the .mp3 file extension)
Get-Childitem | Where {($_.Name.Replace("\D","")).Length -le 3}
That should give me any files that have 3 or fewer digits in their file name, but it doesn't. In fact, it shows none. If I increase the number at the end to 11, I get the first three test files; increasing it to 12 shows all six of them. I assume that the Replace operator doesn't get applied to the file name before the filtering based on Length, although I used brackets around $_.Name.Replace("\D","")
What the hell am I doing wrong?
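I figured it out: Get-ChildItem | Where {($_.Name -replace "\D","").Length -le 3} returns the files that I need to rename. The .Replace() method does a literal string replacement, so "\D" is never treated as a regex, whereas the -replace operator does use regex.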
The whole command I used was
Get-ChildItem | Where {($_.Name -replace "\D","").Length -le 3} | Rename-Item -NewName { $_.Name -replace "Test ","Test 0"}
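The difference between the two forms can be checked on a plain string, without touching any files. This is a small illustration using one of the file names from the question; the variable names are made up for the example.

```powershell
$name = 'Test 01.mp3'

# String.Replace() searches for the literal two characters \D,
# finds none, and returns the name unchanged.
$viaMethod = $name.Replace('\D', '')

# The -replace operator treats \D as a regex (any non-digit character),
# so only the digits are left.
$viaOperator = $name -replace '\D', ''

$viaMethod.Length     # length of the unchanged name
$viaOperator.Length   # number of digits, including the 3 from .mp3
```

This matches the behaviour described in the question: the unchanged name is 11 characters long, which is why the filter only started returning files at 11.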
It's also possible to rename all files to a number-only scheme and use the PadLeft method.
By replacing "Test " with "Test 0", you would still not achieve what you want on files that are numbered Test 1.mp3 (as this will become Test 01.mp3, which is one leading zero short).
You can make sure all files will have a 3-digit sequence number by doing this:
Get-ChildItem -Path 'D:\Test' -Filter '*.mp3' -File |
    Where-Object {$_.BaseName -match '(\D+)(\d+)$'} |
    Rename-Item -NewName { '{0}{1:D3}{2}' -f $Matches[1], [int]$Matches[2], $_.Extension }
With this, wrongly named files like Test 00000003.mp3 will also be renamed to Test 003.mp3.
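The padding logic can be tried on strings first, before renaming anything on disk. The sketch below uses PadLeft instead of the D3 format specifier; the sample names are illustrative, and casting to [int] is what drops surplus leading zeros.

```powershell
# Hypothetical file names covering too few, correct, and too many leading zeros.
$names = 'Test 1.mp3', 'Test 02.mp3', 'Test 100.mp3', 'Test 00000003.mp3'

$renamed = foreach ($n in $names) {
    $base = [System.IO.Path]::GetFileNameWithoutExtension($n)
    $ext  = [System.IO.Path]::GetExtension($n)
    if ($base -match '^(\D+)(\d+)$') {
        # [int] strips existing leading zeros, PadLeft restores exactly three digits.
        $number = ([int]$Matches[2]).ToString().PadLeft(3, '0')
        '{0}{1}{2}' -f $Matches[1], $number, $ext
    }
    else {
        # Names without a trailing number are passed through unchanged.
        $n
    }
}
$renamed
```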

Comparing two text files is not working in Powershell

I am trying to compare the contents of two text files and have only the differences be outputted to the console.
The first text file is based on the file names in a folder.
$AsyFolder = Get-ChildItem -Path .\asy-data -Name
I then remove the prefix of the file name that is set up by the user and is the same for every file and is separated from the relevant info with a dash.
$AsyFolder| ForEach-Object{$_.Split("-").Replace("$Prefix", "")} | Where-Object {$_}|Set-Content -Path .\templog.txt
The output ($AsyFolder) looks like this:
bpm.art
gbr.pdf
asy.pdf
fab.pdf
as1.art
odb.tgz
ccam.cad
read_me_asy.txt
There is another file that is the reference and contains the suffixes of files that should be there.
The reference file looks like this:
tpm.art
bpm.art
gbr.pdf
asy.pdf
fab.pdf
as1.art
as2.art
odb.tgz
xyp.txt
ccam.cad
And its contents are received with $AsyTemplate = Get-Content -Path C:\Users\asy_files.txt
The logic is as follows
$AsyTemplate |
    ForEach-Object{
        If(Select-String -Path .\templog.txt -Pattern $_ -NotMatch -Quiet){
            Write-Host "$($_)"
        }
    }
I have tried various ways of setting up the templog.txt with -InputObject: using Get-Content, Get-Content -Raw, a variable, writing an array manually. I have also tried removing -NotMatch and using -eq $False for the output of select string.
Every time, though, the output is just the contents of asy_files.txt (Reference File). It doesn't seem to care what is in templog.txt ($AsyFolder Output).
I have tried using compare-object/where-object method as well and it just says that both files are completely different.
Thank you @Lee_Dailey for your help in figuring out how to properly ask a question...
It ended up being additional whitespace (3 tabs) after the characters in the reference file asy_files.txt.
It was an artifact from where I copied from, and PowerShell was seeing "as2.art" and "as2.art   " as different strings. I am not 100% sure why that matters, but I found that searching for any whitespace (\s) that appears after a word character (\w) and removing it made the comparison logic work. The Compare-Object | Where-Object approach worked as well after removing the whitespace.
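A minimal sketch of that cleanup, using in-memory arrays in place of the two files. The sample entries and the exact regex are assumptions; the original post does not show the expression that was used. The comparison here is a plain string test with -notin rather than Select-String, which also avoids the dot in "as2.art" being read as a regex wildcard.

```powershell
# Hypothetical data: reference lines with three trailing tabs, as described above.
$AsyTemplate = @("tpm.art`t`t`t", "bpm.art`t`t`t", "as2.art`t`t`t", 'gbr.pdf')
$AsyFolder   = @('bpm.art', 'gbr.pdf')

# Remove any whitespace that trails the last word character on each line.
$cleanTemplate = $AsyTemplate | ForEach-Object { $_ -replace '(?<=\w)\s+$', '' }

# Entries expected by the reference but absent from the folder listing.
$missing = $cleanTemplate | Where-Object { $_ -notin $AsyFolder }
$missing
```

Calling .Trim() on each line would work just as well when the stray whitespace only appears at the ends.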

I'm trying to clean up a script I have by trying to make it build the folder structure based on the file name

I currently have a PowerShell script that moves files to specific folders based on their file names. The top of the script sets a variable for each destination path where a certain group of files should go:
$FileName = "path to where files with that name go"
Then I read in the contents of the entire directory of files recursively into a variable:
$Files = Get-ChildItem $FileFolder -File -Recurse
Then I have a bunch of lines of the same command for matching and moving:
$Files | Where-Object { $_.Name -match 'some name' } | Move-Item -Destination "$Variable-set-above" -Force
It was fine when it was 10 or 20 matches, but with more and more files being added and needing to be organized, I want to see if I can clean up the script by having it build the destination folder structure based on the file name instead of having a line for every match case, and a line for every move.
I was looking into Split-Path, regex -split, String.split(), and some other options, and I think I'm close, but I can't find an example anywhere of where someone takes the first portion of the file name, up to a certain couple of characters, keeping the first part, and excluding the rest. Kind of like a Split-Ignoresecond or something like that.
I'm testing doing this first before modifying my main script, I have this so far:
3 files in a folder named Test.One.File.D0001.txt, Test.Two.File.D0001.txt, and Test.Three.File.D0001.txt.
My test script:
$Testfiles = Get-ChildItem -Name *.txt
$Testfiles.replace('.',' ') -split "D0"
Which gives me an output of:
Test One File
001 txt
Test Three File
001 txt
Test Two File
001 txt
It's weird that it's not in the right order, but I envision that I'd be just dealing with 1 file at a time anyway so that won't matter.
What I'd like to do is read in a file name, ignore the "001 txt" part, use the first part of the filename to build the last part of a destination path for the file move, and then move the file to that destination. I could use Split-Path -Leafbase but I can't figure out the syntax for it to not give me an error, and I'd still be left with part of the filename I don't want.
Say I have a file called One.Two.ThreeD0001 that needs to go to D:\Files\Onestwosthrees. I want my script to read in the files from a folder, and then process the file One.Two.ThreeD0001.txt so that all that's left is "One Two Three", stick it in a variable like $SplitFile, then move the file to a folder built from the filename like D:\Files\Onestwosthrees\$SplitFile.
There's further parsing I want to do, but if I can get this part down I can figure out the sub parsing I need.
Some sources I've looked at so far for clues are:
https://superuser.com/questions/817955
and
https://kevinmarquette.github.io/2017-07-31-Powershell-regex-regular-expression/
Think you were pretty much there:
$root = "C:\Users\users\Downloads\StackTesting"
cd $root
# -Filter is used here because, in Windows PowerShell, -Include only takes effect with -Recurse or a wildcard path
$testFiles = Get-ChildItem -Filter *.txt -File
foreach ( $item in $testFiles ) {
    $directory = ($item.Name.Replace('.', '') -split "D0")[0]
    ## check if folder exists, if not create
    if (!(Test-Path "$root\$directory"))
    {
        New-Item -Type Directory "$root\$directory"
    }
    else
    {
        Write-Host "Folder exists"
    }
    ## Move item to folder
    Move-Item $item.FullName -Destination "$root\$directory"
}
This is how I got your directory names:
$directory = ($item.name.replace('.', '') -split "D0")[0]
Changed from a space to no space as your examples at the bottom didn't have spaces.
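If the spaced form from the question ("One Two Three") is wanted instead, the split can be done on the counter itself. This is only a sketch on strings: the regex assumes the counter always looks like D0 followed by digits, and the destination root is taken from the question's example. Nothing is moved; the loop just builds the target paths.

```powershell
$destinationRoot = 'D:\Files\Onestwosthrees'
$names = 'Test.One.File.D0001.txt', 'One.Two.ThreeD0001.txt'

$plan = foreach ($name in $names) {
    # Keep everything before the D0nnn counter (and the dot in front of it, if any).
    # -csplit is the case-sensitive variant, so a lowercase "d0" inside a name is ignored.
    $prefix = ($name -csplit '\.?D0\d+')[0]

    # Turn the remaining dots into spaces: One.Two.Three becomes One Two Three.
    $SplitFile = $prefix -replace '\.', ' '

    [pscustomobject]@{
        File        = $name
        Destination = "$destinationRoot\$SplitFile"
    }
}
$plan
```

Each Destination value could then be passed to New-Item and Move-Item in the same way as in the loop above.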

How to robocopy using regex matched folder from one server to another

I've a build system which uses robocopy to copy files from one system to our server, and to a specific path. The following has worked well till a new requirement was introduced:
robocopy local\dist \\server01\somepath\dist XF *.* /E
Now, we want to have a changing 'dist' name to include build information. For example, 'dist1', 'dist2', 'distabcd'. Anyhow, the point is that the folder name is changing. How do I tell robocopy to match on any name beginning with 'dist', but copy to the correctly named dist folder on the remote server?
robocopy local\dist* \\server01\somepath\[????] XF *.* /E
I have the option to use PowerShell commands to do this, assuming it may be able to copy to the server location. I know almost nothing about PowerShell, but welcome any tips.
PowerShell provides regex functionality through operators such as -match and -replace. Here is an example of what capturing the changing directories could look like:
$localDirectory = "local\dist"
$directory = "\\server01\somepath\dist"
$keyword = "dist"
$fileDirectory = Get-ChildItem -Path $directory -Recurse
foreach ($container in $fileDirectory)
{
    # -match is one of the regex operators we can use for this operation
    # e.g. dist1.. dist2.. distabc.. adist.. mynewdistribution
    if ($container.Name -match $keyword)
    {
        try
        {
            # -ErrorAction Stop turns copy failures into terminating errors so the catch block runs
            Copy-Item -Path $container.FullName -Destination $localDirectory -Force -Recurse -ErrorAction Stop
        }
        catch [System.Exception]
        {
            Write-Output $_.Exception
        }
    }
}
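For the direction asked about in the question (local folders beginning with 'dist' copied to a same-named folder on the server), the following is a rough sketch. Get-DistCopyPlan is a hypothetical helper invented for this example, and the folder names are sample values; in a real run they would come from Get-ChildItem. The robocopy call is only printed, not executed.

```powershell
# Hypothetical helper: pairs each local folder whose name starts with 'dist'
# with the folder of the same name under the remote root.
function Get-DistCopyPlan {
    param(
        [string[]]$FolderNames,
        [string]$LocalRoot,
        [string]$RemoteRoot
    )
    foreach ($name in $FolderNames) {
        # ^dist anchors the match to the start, so 'adist' is skipped.
        if ($name -match '^dist') {
            [pscustomobject]@{
                Source      = "$LocalRoot\$name"
                Destination = "$RemoteRoot\$name"
            }
        }
    }
}

# Sample names; in practice: $names = (Get-ChildItem -Path 'local' -Directory).Name
$names = 'dist1', 'distabcd', 'adist', 'docs'
$plan  = Get-DistCopyPlan -FolderNames $names -LocalRoot 'local' -RemoteRoot '\\server01\somepath'

foreach ($p in $plan) {
    # To perform the copy, replace this line with: robocopy $p.Source $p.Destination /E
    "robocopy `"$($p.Source)`" `"$($p.Destination)`" /E"
}
```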

PowerShell matching strings with Regex

I'm working on a script to move TV shows into their corresponding folders on my drive. I'm having issues matching shows to their folders. This is the snippet of code I'm having a problem with:
#Remove all non-alphanumeric characters from the name
$newname = $Episode.Name -replace '[^0-9a-zA-Z ]', ' '
#Split the name at S01E01 and store the showname in a variable (Text before S01E01)
$ShowName = [regex]::Split($newname, 'S*(\d{1,2})(x|E)')[0]
#Match and get the destination folder where the names are similar
################## THIS IS WHERE THE ISSUE IS #######################
$DestDir = Gci -Path $DestinationRoot | Where { $ShowName -like "*$($_.Name)*" } | foreach {$_.Name }
For example, a show named "Doctor Who 2005 S02E02 Tooth and Claw.mp4" is not returning a similar folder, which is named "DoctorWho".
Question(s):
How can I modify $DestDir so that I can match the names? Is there a better way of doing this?
Working Code:
# Extract the name of the show (text before SxxExx)
$ShowName = [regex]::Split($Episode.Basename, '.(\d{1,3})(X|x|E|e)(\d{1,3})')[0]
# Assumption: There is a folder in TV shows directory that is named correctly, and the input file is named correctly
# Try to match by stripping all non-Alphabet characters from both names and check if the folder name contains the file name
$Folder = gci -Path $DestinationRoot |
    Where {$_.PSisContainer -and `
        (($_.Name -replace '[^A-Za-z]','') -match ($ShowName -replace '[^A-Za-z]','')) } |
    select -ExpandProperty fullname
Some sample output from testing:
Input file name: Arrow S01E02.mp4
Show name: Arrow
Matching folder: C:\Users\Public\Videos\TV Shows\Arrow
-----------------------------------------------------------------------
Input file name: Big Bang Theory S3E03.avi
Show name: Big Bang Theory
Matching folder: C:\Users\Public\Videos\TV Shows\The Big Bang Theory
-----------------------------------------------------------------------
Input file name: Doctor Who S08E03.mp4
Show name: Doctor Who
Matching folder: C:\Users\Public\Videos\TV Shows\Doctor Who (2005)
-----------------------------------------------------------------------
Input file name: GameOfThronesS01E01.mp4
Show name: GameOfThrones
Matching folder: C:\Users\Public\Videos\TV Shows\Game Of Thrones
-----------------------------------------------------------------------
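The matching step can be reproduced on plain strings to see why these pairs line up. In this sketch the folder and file names are copied from the sample output above and held in arrays rather than read from disk; the variable names other than $ShowName are made up for the example.

```powershell
$folders = 'Arrow', 'The Big Bang Theory', 'Doctor Who (2005)', 'Game Of Thrones'
$files   = 'Arrow S01E02', 'Big Bang Theory S3E03', 'Doctor Who S08E03', 'GameOfThronesS01E01'

$results = foreach ($baseName in $files) {
    # Text before the SxxExx marker (the leading '.' in the pattern consumes the S).
    $ShowName = [regex]::Split($baseName, '.(\d{1,3})(X|x|E|e)(\d{1,3})')[0]

    # Compare letters only, so spaces, years and brackets do not get in the way.
    $key    = $ShowName -replace '[^A-Za-z]', ''
    $folder = $folders | Where-Object { ($_ -replace '[^A-Za-z]', '') -match $key }

    [pscustomobject]@{ Show = $ShowName.Trim(); Folder = $folder }
}
$results
```

Note that an empty $key would match every folder, so a real script should skip files where no show name could be extracted.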
I use the same method as you to work out the show name, based on your suggestion, with Doctor Who 2005 S02E02 Tooth and Claw.mp4 as the example:
$showName = $Episode -replace '[^0-9a-zA-Z ]'
$showName = ($showName -split ('S*(\d{1,2})(x|E)'))[0]
$showName = $showName -replace "\d"
I added the line $showName = $showName -replace "\d" to account for the year in the name. The caveat is that this breaks if the show name itself contains a number, but it should work for most. On to determining $DestDir: part of the issue is that your Where comparison is backwards. You want to see if the show name is part of the potential folder name, not the other way around. Also, since both names could contain spaces, the comparison should strip them from both sides.
Get-ChildItem -Path $DestinationRoot -Directory | Where-Object { ($_.Name -replace " ") -like "*$($showName -replace ' ')*" }
I would go on to use a Choice selection to have the user confirm the folder since it is possible to have multiple matches. I would like to point out that it might be hard to account for all naming conventions and variances but what you have is a good start.
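Put together on plain strings, the steps above can be checked without a real folder tree. This is a sketch under assumptions: the folder list is invented apart from "DoctorWho", which comes from the question, and spaces are removed from the show name as well so that "Doctor Who" can match "DoctorWho".

```powershell
$Episode = 'Doctor Who 2005 S02E02 Tooth and Claw.mp4'
$folders = 'DoctorWho', 'Arrow', 'The Big Bang Theory'   # stand-ins for directory names

# Same three steps as above: strip punctuation, cut at the SxxExx marker, drop digits.
$showName = $Episode -replace '[^0-9a-zA-Z ]'
$showName = ($showName -split 'S*(\d{1,2})(x|E)')[0]
$showName = $showName -replace '\d'

# Remove spaces on both sides before the wildcard comparison.
$key = $showName -replace ' '
$candidates = $folders | Where-Object { ($_ -replace ' ') -like "*$key*" }
$candidates
```

If $candidates holds more than one entry, that is the point where a confirmation prompt would be useful.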