I need to search file names in a directory for characters in particular positions. I am looking for files with parentheses within parentheses,
like this:
# 2262281102-03_Cutting_Plate_Lower_Stop_(Anschlag_Cutting_Frame_(Schnittgestell)_unten)_400kN
GET-CHILDITEM C:\BU\p -recurse | WHERE-OBJECT {$_.nAME -MATCH "(?!)((?!)((!?))(!?))(!?)"}
I also need to match any file with 4+ letters and no parentheses, e.g.:
# 2277131504-03_Haltebolzen_platte
GET-CHILDITEM C:\BU\p -EXCLUDE "*)*" -recurse | WHERE-OBJECT {$_.nAME -MATCH "\W\.[^\W]"}
I've got this:
$tests = @(
'2262281102-03_Cutting_Plate_Lower_Stop_(Anschlag_Cutting_Frame_(Schnittgestell)_unten)_400kN',
'2277131504-03_Haltebolzen_platte'
)
$regex = '^.*\(.*\(.*\).*\).*$|^[^()]*[a-z]{4}[^()]*$'
$tests -match $regex
2262281102-03_Cutting_Plate_Lower_Stop_(Anschlag_Cutting_Frame_(Schnittgestell)_unten)_400kN
2277131504-03_Haltebolzen_platte
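To run the same pattern against real file names rather than test strings, it can be wrapped in a small filter. This is only a sketch: Test-NameShape is a hypothetical helper name (not a built-in cmdlet), and the C:\BU\p path in the usage comment is taken from the question.

```powershell
# Hypothetical helper: $true when the name has nested parentheses,
# or has no parentheses at all but contains a run of 4+ letters.
function Test-NameShape {
    param([string]$Name)
    $regex = '^.*\(.*\(.*\).*\).*$|^[^()]*[a-z]{4}[^()]*$'
    # -match is case-insensitive by default, so [a-z] also covers A-Z
    return $Name -match $regex
}

# Usage against the directory from the question:
# Get-ChildItem C:\BU\p -Recurse | Where-Object { Test-NameShape $_.Name }
```

A name with a single, non-nested pair of parentheses fails both alternatives and is filtered out.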
I have almost 400 .sql files in which I need to search for a specific pattern and output the results.
e.g.
*file1.sql
select * from mydb.ops1_tbl from something1 <other n lines>
*file2.sql
select * from mydb.ops2_tbl from something2 <other n lines>
*file3.sql
select * from mydb.ops3_tbl ,mydb.ops4_tbl where a = b <other n lines>
Expected result
file1.sql mydb.ops1_tbl
file2.sql mydb.ops2_tbl
file3.sql mydb.ops3_tbl mydb.ops4_tbl
The PowerShell script below is able to fetch the file names:
Get-ChildItem -Recurse -Filter *.sql|Select-String -pattern "mydb."|group path|select name
The PowerShell script below is able to fetch the matching lines:
Get-ChildItem -Recurse -Filter *.sql | Select-String -pattern "mydb." |select line
I need the output in the format shown above. Does anyone have any pointers?
You need to escape the dot with a backslash (\.) so the regex matches a literal dot.
To get all matches on a line, use the -AllMatches parameter.
You need a better regex to match the mydb string up to the next space.
Iterate over the Select-String results with ForEach-Object.
A one liner:
Get-ChildItem -Recurse -Filter *.sql|Select-String -pattern "mydb\.[^ ]+" -Allmatches|%{$_.path+" "+($_.Matches|%{$_.value})}
broken up
Get-ChildItem -Recurse -Filter *.sql|
Select-String -Pattern "mydb\.[^ ]+" -Allmatches | ForEach-Object{
$_.path+" "+($_.Matches|ForEach-Object{$_.value})
}
Sample output:
Q:\Test\2019\01\24\file1.sql mydb.ops1_tbl
Q:\Test\2019\01\24\file2.sql mydb.ops2_tbl
Q:\Test\2019\01\24\file3.sql mydb.ops3_tbl mydb.ops4_tbl
If you don't want the full path (even though you are recursing), as in your expected result,
replace $_.path with (Split-Path $_.path -Leaf)
First, fetch the result of your file query into an array, then iterate over it and extract the file contents using regex matching:
$files = Get-ChildItem -Recurse -Filter *.sql|Select-String -pattern "mydb\."|group path|select name
foreach ($file in $files)
{
$str = Get-Content -Path $file.Name
# use $found rather than $matches, which is an automatic variable set by -match
$found = ($str | select-string -pattern "mydb\.\w+" -AllMatches).Matches.Value
[console]::writeline("{0} {1}", $file.Name, [string]::Join(' ', $found) )
}
I used the .NET WriteLine function to output the result for demonstration purpose only.
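The extraction step can also be isolated in a small function, which makes it easy to try on sample text before pointing it at 400 files. A minimal sketch: Get-TableReference is a hypothetical helper name, and the mydb\.\w+ pattern assumes table names consist only of word characters.

```powershell
# Hypothetical helper: returns every mydb.<name> reference found in the given lines.
function Get-TableReference {
    param([string[]]$Line)
    foreach ($l in $Line) {
        # [regex]::Matches is case-sensitive by default; IgnoreCase mirrors Select-String
        foreach ($m in [regex]::Matches($l, 'mydb\.\w+', 'IgnoreCase')) {
            $m.Value
        }
    }
}

# Usage: one output row per file, with duplicate references removed.
# Get-ChildItem -Recurse -Filter *.sql | ForEach-Object {
#     $refs = Get-TableReference (Get-Content -LiteralPath $_.FullName) | Select-Object -Unique
#     if ($refs) { '{0} {1}' -f $_.Name, ($refs -join ' ') }
# }
```

Collecting per file rather than per matching line keeps the output to one row per file even when mydb references are spread over several lines.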
I am trying to move files to a certain folder if they start with a letter and delete them if they start with anything other than a letter.
My code:
Function moveOrDelete($source, $dest)
{
$aToZ = '^[a-zA-Z].*'
$notALetter = '^[^a-zA-Z].*'
Get-ChildItem -Path $source\$aToZ -Recurse | Move-Item -Destination $dest
Get-ChildItem -Path $source\$notALetter -Recurse | Remove-Item
}
As I understand it, the caret matches at the start of the string when it's outside the brackets. In other words, the regex in the $aToZ variable will match anything that begins with a letter, and the .* part allows the rest of the file name to be anything. The caret inside the brackets negates the character class, so a file name that begins with anything other than a letter will match. I can't get it to work, and I'm not getting any errors, which leads me to believe that my regex is wrong.
I have checked this with online tools including this one: https://regex101.com/ and they check out.
I have also used variations of the regex like ^[a-zA-Z] that don't work. Some patterns like [a-zA-Z]* move the files but it's not the pattern that I want.
Here is how I'm calling the function:
moveOrDelete ".\source" ".\dest"
And here are the sample file names I'm using:
a.txt
z.txt
1.txt
.txt
The -Path argument doesn't understand regular expressions; it takes a string and can perform wildcard matching, but not regex matching.
So, you need to check the name of each file against the regex with the -match operator. The following should help:
Function moveOrDelete($source, $dest)
{
$aToZ = '^[a-zA-Z].*'
$notALetter = '^[^a-zA-Z].*'
Get-ChildItem -Path $source -Recurse | Where-Object { $_.name -match $aToZ } | Move-Item -Destination $dest
Get-ChildItem -Path $source -Recurse | Where-Object { $_.name -match $notALetter } | Remove-Item
}
Here, you need to filter the file names with the Where-Object cmdlet, then pipe to the move or remove.
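Before moving or deleting anything, the classification can be checked on plain strings. The sketch below uses the sample file names from the question and touches no files; the variable names $toMove and $toDelete are just for illustration.

```powershell
# Dry run on the sample names from the question; no files are touched.
$names = 'a.txt', 'z.txt', '1.txt', '.txt'
$aToZ  = '^[a-zA-Z]'   # -match is unanchored at the end, so a trailing .* is optional

$toMove   = @($names | Where-Object { $_ -match $aToZ })
$toDelete = @($names | Where-Object { $_ -notmatch $aToZ })

"move:   $($toMove -join ', ')"
"delete: $($toDelete -join ', ')"
```

Move-Item and Remove-Item both accept -WhatIf, which is another way to preview the real pipeline without changing anything.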
I am looking for a way to remove several special characters from filenames via a powershell script.
My filenames look like this:
[Report]_first_day_of_month_01_(generated_by_powershell)_[repnbr1].txt
I have been puzzling over removing the [] and everything between them, the () and everything between those, and removing all the _'s as well, with the desired result being a filename that looks like this:
first day of month 01.txt
Thus far, I have tried the below solution to no avail. I have run each of these from the directory in which the files reside.
Get-ChildItem -Path .\ -Filter *.mkv | %{
$Name = $_.Name
$NewName = $Name -Replace "(\s*)\(.*\)",''
$NewName2 = $NewName -Replace "[\s*]\[.*\]",''
$NewName3 = $NewName2 -Replace "_",' '
Rename-Item -Path $_ -NewName $NewName3
}
It does not work even if I try one set at a time, like this:
Get-ChildItem -Path .\ -Filter *.mkv | %{
$Name = $_.Name
$NewName = $Name -Replace "(\s*)\(.*\)",''
Rename-Item -Path $_ -NewName $NewName
}
I assume there is an inherent flaw in the way I am trying to accomplish this task. That being said, I would prefer to use the Rename-Item cmdlet rather than using a move-item solution.
gci *.txt | Rename-Item -NewName {$_.Name -replace '_*(\[.*?\]|\(.*?\))_*' -replace '_+', ' '}
The rename uses a regex that matches [text] or (text) blocks and replaces them with nothing. Parentheses and brackets need escaping in regexes to match them literally. The pattern also consumes optional leading or trailing underscores, as in [Report]_ or _[repnbr1]; otherwise an _ would be left at the start or end of the name and become a leading or trailing space. Then it replaces the remaining underscores with spaces.
See the regex working here: Regex101
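The two replacement steps can be tried on the sample name as a plain string before renaming any files. A small sketch; $name and $clean are illustrative variable names.

```powershell
# Sample file name from the question
$name = '[Report]_first_day_of_month_01_(generated_by_powershell)_[repnbr1].txt'

# Step 1: drop [..] and (..) blocks together with adjacent underscores.
# Step 2: turn each remaining run of underscores into a single space.
$clean = $name -replace '_*(\[.*?\]|\(.*?\))_*' -replace '_+', ' '
$clean
```

The lazy quantifier .*? matters here: a greedy .* would run from the first [ to the last ] and swallow the part of the name that should be kept.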
I'm trying to rename several files using a regex expression.
ck1823000-23.dat
ck1293834-67.dat
lo1230324-99.dat
pk1232131-34.dat
...
I want to remove -XX
So the result would be like this:
ck1823000.dat
ck1293834.dat
lo1230324.dat
pk1232131.dat
...
I came up with this regex:
(?:.*?)([-\\s].*?).dat
But I get this error:
Rename-Item : The input to the script block for parameter 'NewName'
failed. The regular expression pattern is not valid
When I run this command:
Get-ChildItem . -file -Filter "*.dat" | Rename-Item -newname { $_.name -replace "\(?:.*?)([-\\s].*?).dat\", ""}
Use the regex below and replace the matched characters with an empty string. In PowerShell the backslash is not a string escape character, so the dot is escaped with a single backslash.
-[^.-]*(?=\.dat)
Get-ChildItem . -file -Filter "*.dat" | Rename-Item -newname { $_.name -replace "-[^.-]*(?=\.dat)", ""}
Another option is to use the BaseName property instead of Name and append the extension again:
Get-ChildItem . -file -Filter "*.dat" |
Rename-Item -newname { ($_.basename -replace "-.*") + $_.Extension }
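A rename pattern can be checked against the sample names as plain strings before it is handed to Rename-Item. The sketch below uses the file names from the question; anchoring the lookahead with $ is an added assumption that the suffix always sits directly before the extension.

```powershell
# Sample names from the question; nothing on disk is renamed here.
$names = 'ck1823000-23.dat', 'ck1293834-67.dat', 'lo1230324-99.dat', 'pk1232131-34.dat'

# \. is a literal dot; the lookahead keeps ".dat" out of the match,
# so only the "-NN" part is removed.
$renamed = @($names | ForEach-Object { $_ -replace '-[^.-]*(?=\.dat$)', '' })
$renamed
```

Adding -WhatIf to Rename-Item gives a similar preview on the real files.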
I have a list of regular expressions (about 2000) and over a million HTML files. I want to check whether each regular expression matches each file. How can I do this in PowerShell?
Performance is important, so I don't want to loop through regular expressions.
I tried
$text | Select-String -Pattern pattern1, pattern2,...
and it returns all matches, but I also want to find out which patterns matched and which did not. I need to build a list of the matching regular expressions for each file.
You could try something like this:
$regex = "^test","e2$" #Or use (Get-Content <path to your regex file>)
$ht = @{}
#Modify Get-ChildItem to your criteria (filter, path, recurse etc.)
Get-ChildItem -Filter *.txt | Select-String -Pattern $regex | ForEach-Object {
$ht[$_.Path] += @($_ | Select-Object -ExpandProperty Pattern)
}
Test-output:
$ht | Format-Table -AutoSize
Name Value
---- -----
C:\Users\graimer\Desktop\New Text Document (2).txt {e2$}
C:\Users\graimer\Desktop\New Text Document.txt {^test, e2$}
You didn't specify how you wanted the output.
UPDATE: To match multiple patterns on a single line, try this (mjolinor's answer is probably faster than this).
$regex = "^test","e2$" #Or use (Get-Content <path to your regex file>)
$ht = @{}
#Modify Get-ChildItem to your criteria (filter, path, recurse etc.)
$regex | ForEach-Object {
$pattern = $_
Get-ChildItem -Filter *.txt | Select-String -Pattern $pattern | ForEach-Object {
$ht[$_.Path] += @($_ | Select-Object -ExpandProperty Pattern)
}
}
UPDATE2: I don't have enough samples to try it, but since you have such a huge number of files, you might want to try reading each file into memory before looping through the patterns. It may be faster.
$regex = "^test","e2$" #Or use (Get-Content <path to your regex file>)
$ht = @{}
#Modify Get-ChildItem to your criteria (filter, path, recurse etc.)
Get-ChildItem -Filter *.txt | ForEach-Object {
$text = $_ | Get-Content
$filename = $_.FullName
$regex | ForEach-Object {
$text | Select-String -Pattern $_ | ForEach-Object {
$ht[$filename] += @($_ | Select-Object -ExpandProperty Pattern)
}
}
}
I don't see any way around doing a foreach through the regex collection.
This is the best I could come up with performance-wise:
$regexes = 'pattern1','pattern2'
$files = get-childitem -Path <file path> |
select -ExpandProperty fullname
$ht = @{}
foreach ($file in $files)
{
$ht[$file] = New-Object collections.arraylist
foreach ($regex in $regexes)
{
if (select-string $regex $file -Quiet)
{
[void]$ht[$file].add($regex)
}
}
}
$ht
You could speed up the process by using background jobs and dividing up the file collection among the jobs.
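One variation on the loops above is to build the [regex] objects once and read each file a single time, then test every pattern against the lines already in memory. This is a sketch, not a benchmarked solution: Get-MatchingPattern is a hypothetical helper name, 'pattern1' and 'pattern2' are the placeholders from the answer, and whether this beats Select-String on a million files is an assumption that would need measuring.

```powershell
# Hypothetical helper: returns the patterns that match at least one line.
function Get-MatchingPattern {
    param(
        [string[]]$Line,
        [regex[]]$Regex
    )
    foreach ($r in $Regex) {
        foreach ($l in $Line) {
            # Stop scanning at the first hit, like Select-String -Quiet
            if ($r.IsMatch($l)) { $r.ToString(); break }
        }
    }
}

# Build the regex objects once; IgnoreCase mirrors Select-String's default.
$patterns = 'pattern1', 'pattern2' | ForEach-Object {
    [regex]::new($_, [System.Text.RegularExpressions.RegexOptions]::IgnoreCase)
}

# Usage: one entry per file, holding the patterns that matched.
# $ht = @{}
# Get-ChildItem -Path <file path> -File | ForEach-Object {
#     $lines = [IO.File]::ReadAllLines($_.FullName)
#     $ht[$_.FullName] = @(Get-MatchingPattern -Line $lines -Regex $patterns)
# }
```

Matching line by line keeps ^ and $ behaving as they do with Select-String, at the cost of one IsMatch call per line and pattern.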