Powershell Exclusion of Folders with matching filenames - regex

I'm using something like the script below to exclude folders.
The problem is that a file has the same name as one of the folders, and I only want to exclude the folder, not the file. For example, below I want to exclude only the "B" folder and not the "B.txt" file, whereas the current code excludes both the file and the folder.
$exclude = @("*.cer")
$excludeMatch = @("Member", "A", "B", "C", "D")
[regex] $excludeMatchRegEx = '(?i)' + (($excludeMatch | foreach {[regex]::escape($_)}) -join "|") + ''
Get-ChildItem -Path $source -Recurse -Exclude $exclude |
where { $excludeMatch -eq $null -or $_.FullName.Replace($Source, "") -notmatch $excludeMatchRegEx} |
Copy-Item -Destination {
if ($_.PSIsContainer) {
Join-Path $Dest $_.Parent.FullName.Substring($source.length)
} else {
Join-Path $Dest $_.FullName.Substring($source.length)
}
} -Force -Exclude $exclude

If all you want to do is exclude folders, it's pretty easy to do:
Get-ChildItem -Path \Temp -Recurse | Where-Object {$_.PSIsContainer -eq $false}
If you want to compound this using a regex, you could also add that to your Where-Object statement:
Get-ChildItem -Path \Temp -Recurse | Where-Object {$_.PSIsContainer -eq $false -and $_.Name -match $regexPattern }
You can also throw a -notmatch in there if you DON'T want certain things.
One thing to keep in mind with PowerShell... In my experience, the pipeline is GREAT for executing commands from the shell, but that's generally when you're doing things that are very well-defined in your personal command dictionary (things like Get-ADUser -Filter {GivenName -eq "Sam"}). It isn't as good when you're trying to do script-y sorts of things; in that case, you're really best off filtering with Where-Object and selecting down to the items that you need. PowerShell has a reasonably full-featured debugger if you want a good debugging experience, but you can't really step into the pipeline as it stands today to evaluate how something works. Especially if you're having problems with a script or a series of commands, the debugger can be your absolute best friend, and I'd highly recommend breaking things out into individual statements to do some analysis there.
Also, the Get-Member cmdlet is quite arguably one of the top-ten most useful cmdlets in Windows PowerShell (along with Get-Help, Where-Object, Select-Object, Get-Command, and a few others). It really helps when you're starting to evaluate how a script is going to function to analyze the properties of the objects you're working with (in your case, System.IO.FileInfo and System.IO.DirectoryInfo), to help reduce the amount of scratching your head later.
I hope this helps!
Edit:
After reading the comments, I am amending my answer to better fit the problem description.
If you do not wish to preserve the source-folder hierarchy, then you can just run Get-ChildItem -Recurse | Where-Object { $_.PSIsContainer -eq $false -and $_.Name -match $regexPattern } | Copy-Item -Destination target_folder. Copy-Item should simply treat the output as all belonging to one location. Note that Copy-Item (like most other cmdlets that modify something) supports the -WhatIf parameter, so you can verify what would happen before running it for real.
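Coming back to the folder-versus-file part of the question: one way to exclude a folder named B while keeping a file named B.txt is to match whole path segments and, for files, to test only the parent-folder part of the path. The following is a minimal sketch under that assumption; the helper name Test-ExcludedPath and its parameters are hypothetical and not part of the original script.

```powershell
# Folder names to exclude (taken from the question).
$excludeMatch = @('Member', 'A', 'B', 'C', 'D')

# Build a pattern that only matches a complete path segment:
# a backslash, one of the names, then another backslash or end of string.
$alternation  = ($excludeMatch | ForEach-Object { [regex]::Escape($_) }) -join '|'
$excludeRegex = '\\(' + $alternation + ')(\\|$)'

# Hypothetical helper: $RelativePath is the path with $source stripped off,
# e.g. '\B\notes.txt'; $IsContainer says whether the item is a folder.
function Test-ExcludedPath {
    param(
        [string] $RelativePath,
        [bool]   $IsContainer
    )
    $folderPart = $RelativePath
    if (-not $IsContainer) {
        # Drop the file name so that 'B.txt' is never compared to 'B'.
        $lastSlash  = $RelativePath.LastIndexOf([char] '\')
        $folderPart = if ($lastSlash -lt 0) { '' } else { $RelativePath.Substring(0, $lastSlash) }
    }
    # -match is case-insensitive by default, so (?i) is not needed.
    return $folderPart -match $excludeRegex
}
```

In the original pipeline this could stand in for the -notmatch test, for example Where-Object { -not (Test-ExcludedPath $_.FullName.Substring($source.Length) $_.PSIsContainer) }, assuming $source has no trailing backslash.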

Related

Powershell Rename dynamic filenames containing square brackets, from the filetype scans in the directory

I don't know much (in detail) about PowerShell's frustrating issues with square brackets in path strings (apparently because it escapes strings multiple times internally), where I have to use a regex with an asterisk (*) to match the patterns.
I did heavy Googling and found that there's a method, [WildcardPattern]::Escape($Filename), that could help me Rename-Item such dynamic file paths. I thought the code below would work with such dynamic paths, which are the result of file-type scans in the current folder, but disappointingly it doesn't:
Set-Location "$PSScriptRoot"
$MkvFiles = Get-ChildItem -Filter *.mkv -Path $Path
Foreach ($MkvFile in $MkvFiles) {
$MkvOrigName = [WildcardPattern]::Escape($MkvFile.Name)
$MkvOrigFullname = [WildcardPattern]::Escape($MkvFile.FullName)
If ($MkvOrigName -Match '.*(S[0-9]{2}E[0-9]{2}).*') {
$NewNameNoExt = $MkvOrigFullname -Replace '.*(S[0-9]{2}E[0-9]{2}).*', '$1'
$NewName = "$NewNameNoExt.mkv"
Rename-Item $MkvOrigFullname -NewName $NewName
}
}
I am getting the following error with Rename-Item command when I run the above script on the folder that contains the files such as given at the end of question:
Rename-Item : An object at the specified path C:\Users\Username\Downloads\WebseriesName Season
4\WebSeriesName.2016.S04E13.iNTERNAL.480p.x264-mSD`[eztv`].mkv does not exist.
At C:\Users\Username\Downloads\WebseriesName Season 4\BulkFileRenamerFinalv1.ps1:12 char:9
+ Rename-Item $MkvOrigFullname -NewName $NewName
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The Webseries file paths in the current folder, that I am dealing with are like these:
WebSeriesName.2016.S04E01.HDTV.x264-SVA[eztv].mkv
WebSeriesName.2016.S04E02.HDTV.x264-SVA[eztv].mkv
....
....
WebSeriesName.2016.S04E12.iNTERNAL.480p.x264-mSD[eztv].mkv
WebSeriesName.2016.S04E13.iNTERNAL.480p.x264-mSD[eztv].mkv
Could someone help me solve this generically, without having to worry about what the filename strings contain, as long as they contain a string like S04E01, S04E02, etc. and contain square brackets? That is, how can I escape the square brackets and rename the files, as attempted in the code above, to the names given below?
S04E01.mkv
S04E02.mkv
....
....
S04E12.mkv
S04E13.mkv
If you use the pipeline, you don't need to worry about escaping paths. This is because the PSPath property of each piped item automatically binds to the -LiteralPath parameter of Rename-Item.
Set-Location "$PSScriptRoot"
$MkvFiles = Get-ChildItem -Filter *.mkv -Path $Path
Foreach ($MkvFile in $MkvFiles) {
If ($MkvFile.Name -Match '.*(S[0-9]{2}E[0-9]{2}).*') {
$MkvFile | Rename-Item -NewName {"{0}{1}" -f $matches.1,$_.Extension}
}
}
Explanation:
The -NewName parameter supports delay-bind scripting. So we can use a script block to do the property/string manipulation.
If wildcards are not needed for the path query, then using -LiteralPath is the best approach. The -LiteralPath value is bound exactly as typed (literal/verbatim string). -Path for Get-ChildItem accepts wildcards, but -Path for Rename-Item does not support wildcards. Yet it seems like PowerShell still cares when parsing the command. If you must escape some wildcard characters in a -Path parameter that accepts wildcards, then double quoted paths require 4 backticks and single quoted paths require 2 backticks. This is because two levels of escape are required.
When using -match against a single string even if in a conditional statement, the $matches automatic variable is updated when a match is successful. Capture group matches are accessed using syntax $matches.capturegroupname or $matches[capturegroupname]. Since you did not name the capture group, it was automatically named 1 by the system. A second set of () around a capturing group, would have been 2. It is important to remember that when -match is False, $matches is not updated from its previous value.
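The $matches behavior described above can be tried on a plain string. This is a small illustrative sketch; the variable names and the group name ep are made up for the example.

```powershell
$name = 'WebSeriesName.2016.S04E01.HDTV.x264-SVA[eztv].mkv'

# Unnamed group: available as $Matches[1] (or $Matches.1).
if ($name -match '(S[0-9]{2}E[0-9]{2})') {
    $episode = $Matches[1]
}

# Named group: available as $Matches['ep'] (or $Matches.ep).
if ($name -match '(?<ep>S[0-9]{2}E[0-9]{2})') {
    $episodeNamed = $Matches['ep']
}

# A failed match returns $false and leaves $Matches untouched,
# so it still holds the result of the previous successful match.
$matched = 'no-episode-here.mkv' -match '(?<ep>S[0-9]{2}E[0-9]{2})'
$stale   = $Matches['ep']
```

This is why the rename in the loop above sits inside the If block: $matches is only trustworthy right after a successful -match.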
Examples of handling wildcard characters in -Path parameters that support wildcards:
# Using double quotes in the path
$Path = "WebSeriesName.2016.S04E01.HDTV.x264-SVA````[eztv].mkv"
Get-ChildItem -Path $Path
# Using single quotes in the path
$Path = 'WebSeriesName.2016.S04E01.HDTV.x264-SVA``[eztv].mkv'
Get-ChildItem -Path $Path
# Using LiteralPath
$Path = "WebSeriesName.2016.S04E01.HDTV.x264-SVA[eztv].mkv"
Get-ChildItem -LiteralPath $Path
Rename-Item -LiteralPath $Path -NewName 'MyNewName.mkv'
# Using WildcardPattern Escape method
$Path = 'WebSeriesName.2016.S04E01.HDTV.x264-SVA[eztv].mkv'
$EscapedPath = ([WildcardPattern]::Escape([WildcardPattern]::escape($path)))
Get-ChildItem -Path $EscapedPath
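To see what a single call to [WildcardPattern]::Escape does, without touching the file system, the -like operator can be used, since it applies the same wildcard syntax. This is only a sketch of one level of escaping; as noted above, a -Path parameter may need a second level.

```powershell
$name = 'WebSeriesName.2016.S04E01.HDTV.x264-SVA[eztv].mkv'

# Unescaped, '[eztv]' is a character class that matches ONE of e, z, t or v,
# so the name does not match itself when used as a wildcard pattern.
$matchesUnescaped = $name -like $name

# Escape() prefixes the wildcard characters with a backtick.
$escaped = [WildcardPattern]::Escape($name)

# With the brackets escaped, the pattern matches the literal name.
$matchesEscaped = $name -like $escaped
```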

Get child items with date filetypes using regex in Powershell

I am dealing with many files that have a file extension such as .20180615 (yyyyMMdd). I am looking for a way to access all files with a date filetype using regex. The only solution I have right now is to use *2 to get all files with a filetype starting with 2, but I would prefer a solution that uses regex to generalize across all dates in the yyyyMMdd format.
Get-ChildItem -Path $path -Recurse *2
There are many solutions to this issue. Most choices could just be based on personal preference, but there could be performance and validation concerns.
When dealing with files it is generally better to use the built-in filtering as opposed to post-processing with Where-Object and the like. It affects performance, especially when -Recurse is involved. The limitation is that Get-ChildItem -Path only supports basic wildcards. So, with that in mind, LotPings' comment covers that solution.
Get-ChildItem -Path "$path\*.20[01][0-9][01][0-9][0-3][0-9]" -Recurse
You should be covered until the year 2099!
You tagged regex so a simple Where-Object filter for that would just be
Get-ChildItem -Path $path -Recurse | Where-Object{$_.Extension -match '^\.\d{8}$'}
Note that the Extension property includes the leading dot. This lacks some validation, but if the extension is a dot followed by 8 digits you are good to go.
Another approach that validates your criteria better would be to only allow files with valid dates
Get-ChildItem -Path $path -Recurse | Where-Object{
try{
# Extension includes the leading dot, so strip it before parsing
[datetime]::ParseExact($_.Extension.TrimStart('.'),"yyyyMMdd",[System.Globalization.CultureInfo]::InvariantCulture)
} catch {
$false
}
} | ForEach-Object{
# Do stuff
}
If there is a valid date in the extension, that file (or folder) will get passed down the pipeline. That being said, if you have at least PSv3, consider the -File switch for Get-ChildItem.
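The same validation can also be written without try/catch by using [datetime]::TryParseExact. A minimal sketch follows; the function name Test-DateExtension is hypothetical.

```powershell
# Hypothetical helper: $true if the extension is a real date in yyyyMMdd form.
function Test-DateExtension {
    param([string] $Extension)

    # FileInfo.Extension includes the leading dot, e.g. '.20180615'.
    $text   = $Extension.TrimStart('.')
    $parsed = [datetime]::MinValue

    # TryParseExact returns $false instead of throwing on invalid input.
    return [datetime]::TryParseExact($text, 'yyyyMMdd',
        [System.Globalization.CultureInfo]::InvariantCulture,
        [System.Globalization.DateTimeStyles]::None, [ref] $parsed)
}
```

It could then be used as Get-ChildItem -Path $path -Recurse -File | Where-Object { Test-DateExtension $_.Extension }.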

Powershell: unexpected behavior of negated -like and -match conditionals

I have 2 folders in my Windows folder: software and softwaretest.
The first if statement checks for the main folder "software"; then the elseif checks for the backup folder; otherwise execution should fall through to the else...
My problem is that I'm getting the Write-Host output from the elseif, even though I do have a backup folder called softwaretest, so I can't see why it gives me that output and not the else.
Hope someone can guide/help me :-)
If ($SoftwarePathBackup = Get-ChildItem -Path "$Env:SystemRoot" | Where-Object { (!$_.Name -like 'software') }) {
Write-Host ( 'There are no folder named \software\ on this machine - You cant clean/clear/empty the folder!' ) -ForegroundColor Red;
} elseif ($SoftwarePathBackup = Get-ChildItem -Path "$Env:SystemRoot" | Where-Object { ($_.Name -match '.+software$|^software.+') } | Sort-Object) {
Write-Host ( 'There are none folder-backups of \software\ on this machine - You need to make a folder-backup of \software\ before you can clean/clear/empty the folder!' ) -ForegroundColor Red;
} else {
Remove-Item
}
I find it very confusing to have the negation on the right or even in the regex. I think it would be more obvious to negate at the beginning with a ! or -not.
To test whether a folder exists, you can use Test-Path. Test-Path also has a -Filter parameter, which you can use instead of Where-Object. But I think you don't even have to filter.
$SoftwarePath = "$($Env:SystemRoot)\Software", "$($Env:SystemRoot)\SoftwareBackup"
foreach ($Path in $SoftwarePath) {
if (Test-Path -Path $Path) {
Remove-Item -Path $Path -Force -Verbose
}
else {
Write-Output "$Path not found."
}
}
Would that work for you?
Your primary problem is one of operator precedence:
!$_.Name -like 'software' should be ! ($_.Name -like 'software') or, preferably,
$_.Name -notlike 'software' - using PowerShell's not-prefixed operators for negation.
Similarly, you probably meant to negate $_.Name -match '.+software$|^software.+' which is most easily achieved with $_.Name -notmatch '.+software$|^software.+'
As stated in Get-Help about_Operator_Precedence, ! (a.k.a. -not) has higher precedence than -like, so !$_.Name -like 'software' is evaluated as (!$_.Name) -like 'software', which means that the result of !$_.Name - a Boolean - is (string-)compared to wildcard pattern 'software', which always returns $False, so the If branch is never entered.
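The precedence issue is easy to reproduce with a plain string. In this small sketch the value 'Windows' is just an arbitrary name that is not 'software'.

```powershell
$name = 'Windows'   # any name other than 'software'

# ! binds tighter than -like: (!$name) is $false, and 'False' is not like 'software'.
$wrong = !$name -like 'software'

# Parenthesized, or written with -notlike, the comparison is negated as intended.
$withParens  = !($name -like 'software')
$withNotLike = $name -notlike 'software'
```

Here $wrong ends up $false even though the name is not 'software', while the other two are $true.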
That said, you can make do without -like and -match altogether and use the implicit wildcard matching supported by Get-Item's -Include parameter (snippet requires PSv3+):
# Get folders whose name either starts with or ends with 'software', including
# just 'software' itself.
$folders = Get-Item -Path $env:SystemRoot\* -Include 'software*', '*software' |
Where-Object PSIsContainer
# See if a folder named exactly 'software' is among the matches.
$haveOriginal = $folders.Name -contains 'software'
# See if there are backup folders among the matches (too).
# Note that [int] $haveOriginal evaluates to 1 if $haveOriginal is $True,
# and to 0 otherwise.
$haveBackups = ($folders.Count - [int] $haveOriginal) -gt 0
# Now act on $folders as desired, based on flags $haveOriginal and $haveBackups.
Note how Get-Item -Path $env:SystemRoot\* is used to explicitly preselect all items (add -Force if hidden items should be included too), which are then filtered down via -Include.
Since Get-Item - unlike Get-ChildItem- doesn't support -Directory, | Where-Object PSIsContainer is used to further limit the matches to directories (folders).
Note: Get-ChildItem was not used, because -Include only takes effect on child (descendant) items (too) when -Recurse is also specified; while -Recurse can be combined with -Depth 0 (PSv5+) in order to limit matching to immediate child directories, Get-ChildItem apparently still tries to read the entries of all child directories as well, which can result in unwanted access-denied errors from directories that aren't even of interest.
In other words: Get-ChildItem -Recurse -Depth 0 -Directory $env:SystemRoot -include 'software*', '*software' is only equivalent if you have (at least) read access to all child directories of $env:SystemRoot.

Trying to create a power shell script that removes text in a filename between two brackets using regex

I am trying to write a script to take a file name and remove any pair of brackets and the text between them from the string
get-childItem *.* -recurse |
foreach-object {$_ -replace '\(([^\)]+)\)', ''}
This outputs the new value for every file in the folder to the prompt, exactly as it should look. What I can't seem to find is a way to set those new values as the filenames. The plan is to do this for multiple files in a folder with the format "name(Randomnumbers).ext".
Any help is appreciated
From my understanding of your question, you want to rename each file so that the parenthesized text is removed from its name. To accomplish that, you can use the $Matches variable that is populated by the -match operator. I'm also assuming you want to maintain the file extension.
Get-ChildItem -Recurse | ForEach-Object {
# Match against the file name only, not the full path
if ($_.Name -match '(?<name>.*)(?:\([^\)]+\))(?<ext>.*)') {
# -LiteralPath keeps the existing path from being treated as a wildcard pattern
Rename-Item -LiteralPath $_.FullName -NewName "$($matches['name'])$($matches['ext'])"
}
}
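As an alternative sketch, the -replace expression from the question can be reused directly to compute the new name, which also handles names with more than one pair of parentheses. The helper name Remove-ParenthesizedText is hypothetical, and the pipeline is left commented out with -WhatIf so that nothing is renamed by accident.

```powershell
# Hypothetical helper: strips every '(...)' group from a file name.
function Remove-ParenthesizedText {
    param([string] $Name)
    return $Name -replace '\([^)]*\)', ''
}

# Possible use (-File needs PSv3+; -WhatIf previews without renaming):
# Get-ChildItem -Recurse -File |
#     Where-Object { $_.Name -match '\(' } |
#     Rename-Item -NewName { Remove-ParenthesizedText $_.Name } -WhatIf
```

Because the files are piped in, Rename-Item binds the existing path literally, so only the new name has to be computed.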

Powershell 'where' statement -notcontains

I have a simple excerpt from a larger script. Basically, I'm trying to do a recursive file search that skips an excluded directory and any child of it.
clear
$Exclude = "T:\temp\Archive\cst"
$list = Get-ChildItem -Path T:\temp\Archive -Recurse -Directory
$list | where {$_.fullname -notlike $Exclude} | ForEach-Object {
Write-Host "--------------------------------------"
$_.fullname
Write-Host "--------------------------------------"
$files = Get-ChildItem -Path $_.fullname -File
$files.count
}
At the moment this script will exclude the T:\temp\Archive\cst directory, but not the T:\temp\Archive\cst\artwork directory. I'm struggling to overcome this simple thing.
I've tried the -notlike (which I didn't really expect to work) but also the -notcontains which I was hopeful of.
Can anyone offer any advice, I'm thinking it would require a regex match which I'm reading up on now, but not very familiar with.
In the future the $exclude variable will be an array of strings (directories) but at the moment just trying to get it to work with a straight string.
Try:
where {$_.fullname -notlike "$Exclude*"}
You could also try
where {$_.fullname -notmatch [regex]::Escape($Exclude) }
but the -notlike approach is easier.
When used without wildcards the -like operator does the same as the -eq operator. If you want to exclude a folder T:\temp\Archive\cst and everything below it, you need something like this:
$Exclude = 'T:\temp\Archive\cst'
Get-ChildItem -Path T:\temp\Archive -Recurse -Directory | ? {
$_.FullName -ne $Exclude -and
$_.FullName -notlike "$Exclude\*"
} | ...
-notlike "$Exclude\*" would only exclude subfolders of $Exclude, not the folder itself, and -notlike "$Exclude*" would also exclude folders like T:\temp\Archive\cstring, which may be undesired.
The -contains operator is used to check if a list of values contains a particular value. It doesn't check if a string contains a particular substring.
See Get-Help about_Comparison_Operators for further information.
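The "folder itself or anything below it" test can be checked on plain strings, including the array of excluded folders the question mentions as a future requirement. This is a sketch only: the helper name Test-InFolder and the sample paths cstring and other are made up for illustration.

```powershell
# Hypothetical helper: $true if $Path is $Folder itself or anything below it.
function Test-InFolder {
    param([string] $Path, [string] $Folder)

    # Escape wildcard characters such as [ ] so the folder is taken literally.
    $literal = [WildcardPattern]::Escape($Folder)
    return ($Path -like $literal) -or ($Path -like "$literal\*")
}

$exclude = 'T:\temp\Archive\cst', 'T:\temp\archive\as'
$paths   = 'T:\temp\Archive\cst', 'T:\temp\Archive\cst\artwork',
           'T:\temp\Archive\cstring', 'T:\temp\Archive\other'

# Keep only the paths that fall under none of the excluded folders.
$kept = $paths | Where-Object {
    $p = $_
    -not ($exclude | Where-Object { Test-InFolder $p $_ })
}
```

With these sample values, cst and cst\artwork are dropped, while cstring and other are kept.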
Try changing
$Exclude = "T:\temp\Archive\cst"
To:
$Exclude = "T:\temp\Archive\cst\*"
This will still return the folder CST as it is a child item of Archive, but will exclude anything under cst.
Or:
$Exclude = "T:\temp\Archive\cst*"
But that will also exclude any files that start with "cst" under Archive. The same goes for Graimer's answer; just be aware of the trailing \ and whether it's important to what you are doing.
For those looking for a similar answer, this is what I ended up going with (to check each path against an array of excluded paths with a wildcard match):
# Declare variables
[string]$rootdir = "T:\temp\Archive"
[String[]]$Exclude = "T:\temp\Archive\cst", "T:\temp\archive\as"
[int]$days = 90
# Create Directory list minus excluded directories and their children
$list = Get-ChildItem -Path $rootdir -Recurse -Directory | where {$path = $_.fullname; -not @($exclude | ? {$path -like $_ -or $path -like "$_\*" }) }
Provides what I needed.
Thought I would add to this as I recently had a similar problem answered. You can use the -notcontains operator, but the counterintuitive thing is that the $exclude array needs to be at the start of the expression.
Here is an example.
If I perform the following, no items are excluded and it returns "a","b","c","d"
$result = @()
$ItemArray = @("a","b","c","d")
$exclusionArray = @("b","c")
$ItemArray | Where-Object { $_ -notcontains $exclusionArray }
If I switch the variables around in the expression then it works and returns "a","d".
$result = @()
$ItemArray = @("a","b","c","d")
$exclusionArray = @("b","c")
$ItemArray | Where-Object { $exclusionArray -notcontains $_ }
I am not sure why the arrays have to be this way around to work. If anyone else can explain that would be great.
EDITED 12/12/20 - I now know that the other operation to use is "-in" as in
$_ -notin $exclusionArray
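As for why the order matters: -contains and -notcontains expect the collection on the left and the single value to look for on the right, whereas -in and -notin take them the other way round. The three variants side by side, as a small sketch using the arrays from above:

```powershell
$itemArray      = @('a', 'b', 'c', 'd')
$exclusionArray = @('b', 'c')

# Left side is the single item, right side is the whole array: the item never
# "contains" the array, so -notcontains is always $true and nothing is filtered.
$wrongOrder = $itemArray | Where-Object { $_ -notcontains $exclusionArray }

# Collection on the left, single value on the right: works as intended.
$withNotContains = $itemArray | Where-Object { $exclusionArray -notcontains $_ }

# -notin (PowerShell 3.0+) takes the operands the other way round.
$withNotIn = $itemArray | Where-Object { $_ -notin $exclusionArray }
```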