I extract a string that contains a lot of text, including both a MAC address and a UUID.
For example:
![LOG[AA:AA:AA:AA:AA:AA, 0A0A0000-0000-0000-0000-A0A00A000000: found optional advertisement C0420054]LOG]!><time="09:07:57.573-120" date="04-19-2017" component="SMSPXE" context="" type="1" thread="2900" file="database.cpp:533"
I would like to strip the output so that it only displays the MAC address (e.g. AA:AA:AA:AA:AA:AA) and the UUID (e.g. 0A0A0000-0000-0000-0000-A0A00A000000).
I don't know how to trim the output.
Here is my script:
$Path = "\\AAAAAAAA\logs$"
$Text = "AA:AA:AA:AA:AA:AA"
$PathArray = @()
$Results = "C:\temp\test.txt"
# This code snippet gets all the files in $Path that end in ".log".
Get-ChildItem $Path -Filter "*.log" |
Where-Object { $_.Attributes -ne "Directory"} |
ForEach-Object {
If (Get-Content $_.FullName | Select-String -Pattern $Text) {
$PathArray += $_.FullName
}
}
Write-Host "Contents of ArrayPath:"
$PathArray | ForEach-Object {$_}
get-content $PathArray -ReadCount 1000 |
foreach { $_ -match $Text}
Instead of filtering out directories with the Where-Object cmdlet, you can use the -File switch of the Get-ChildItem cmdlet (available in PowerShell 3.0 and later). You also don't have to load the content with Get-Content yourself; just pipe the files to the Select-String cmdlet.
To grab the MAC address and UUID, I looked up a regex for each and combined them:
$Path = "\\AAAAAAAA\logs$"
$Pattern = '([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2}),\s+(\{{0,1}([0-9a-fA-F]){8}-([0-9a-fA-F]){4}-([0-9a-fA-F]){4}-([0-9a-fA-F]){4}-([0-9a-fA-F]){12}\}{0,1})'
$Results = "C:\temp\test.txt"
Get-ChildItem $Path -Filter "*.log" -File |
Select-String $Pattern |
ForEach-Object {
$_.Matches.Value
} |
Out-File $Results
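If the MAC address and UUID are needed as separate values rather than as one combined match, named capture groups are one option. The following is only a sketch and not part of the answer above: it runs against a shortened copy of the sample log line from the question, and the group names mac and uuid are arbitrary choices made for illustration.

```powershell
# Shortened sample line from the question; in practice this would come from the log files.
$line = '![LOG[AA:AA:AA:AA:AA:AA, 0A0A0000-0000-0000-0000-A0A00A000000: found optional advertisement C0420054]LOG]!>'

# Named groups (hypothetical names 'mac' and 'uuid') capture each value separately.
$pattern = '(?<mac>(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}),\s+(?<uuid>[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12})'

# -match fills the automatic $Matches table when the pattern is found.
$result = if ($line -match $pattern) {
    [pscustomobject]@{
        MAC  = $Matches['mac']
        UUID = $Matches['uuid']
    }
}
$result
```

With Select-String, the same groups should be reachable through $_.Matches[0].Groups['mac'].Value and $_.Matches[0].Groups['uuid'].Value.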
Related
I'm trying to find all .git\config files but I cannot figure out how to do it.
When I just use \.git as a pattern, it finds all directories
Get-ChildItem -Path C:\Repositories -Recurse -Hidden |
Where-Object {
$_.FullName -match '\.git'
} |
Select-Object FullName
but when I expand it to \.git\\config$ to give me only the config files, it yields no results:
Get-ChildItem -Path C:\Repositories -Recurse -Hidden |
Where-Object {
$_.FullName -match '\.git\\config$'
} |
Select-Object FullName
What am I missing here? Is it because the config file does not have an extension?
I'm using PowerShell 5.1.
config isn't a hidden file, so only showing hidden files is causing it to be ignored.
Get-ChildItem -Recurse -Force | where-object fullname -match '\.git\\config'
I am using a PowerShell command to find all *.vue files (it's a simple text format) in a directory, where I need to match this:
7,Id
6,Default
So, these are 2 consecutive lines. With Notepad++ I see CRLF at the end of the line. Following Google searches, this must be close:
Get-ChildItem "D:\Wim\TM1\TI processes" -Filter *.vue -Recurse |
Select-String -Pattern "7,Id\r\n6,Default" -CaseSensitive |
Out-File C:\test.txt
But it does not find the files. I checked that I can find the first part (7,Id) correctly, and also the second part (6,Default), but the combination with the newline is not working.
Any ideas please? Maybe an alternative?
I can have a workaround but it's inefficient and a lot of coding. For example, I could use PowerShell to provide a list of only the first sentence, then process these files to see if it matches the second sentence as well. I want to avoid that.
You need to pass the content of the file as a single string, otherwise Select-String will apply the pattern to each line separately.
Get-ChildItem "D:\Wim\TM1\TI processes" -Filter *.vue -Recurse | ForEach-Object {
Get-Content $_.FullName | Out-String |
Select-String -Pattern "7,Id\r\n6,Default" -CaseSensitive |
Select-Object -Expand Matches |
Select-Object -Expand Groups |
Select-Object -Expand Value
} | Out-File C:\test.txt
On PowerShell v3 and newer you can use Get-Content -Raw instead of Get-Content | Out-String.
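The difference between per-line and whole-text matching can be seen without any files. This is a minimal sketch that uses an in-memory array of made-up lines as a stand-in for what Get-Content returns by default; joining the lines with CRLF is assumed to approximate what Get-Content -Raw would return for a CRLF-terminated file.

```powershell
# In-memory stand-in for a file's content: an array of lines (sample data, not from a real .vue file).
$lines = '5,Name', '7,Id', '6,Default', '1,End'
$pattern = '7,Id\r\n6,Default'

# Piping the array applies the pattern to each line on its own, so nothing matches.
$perLine = @($lines | Select-String -Pattern $pattern -CaseSensitive)

# A single string containing the line breaks lets the pattern span both lines.
$text = $lines -join "`r`n"
$wholeText = @($text | Select-String -Pattern $pattern -CaseSensitive)

"Per-line matches:   $($perLine.Count)"
"Whole-text matches: $($wholeText.Count)"
```

If the files might use bare LF line endings, writing the pattern as 7,Id\r?\n6,Default would cover both cases.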
As an alternative to Select-String you could use the -cmatch operator in a Where-Object filter:
Get-ChildItem "D:\Wim\TM1\TI processes" -Filter *.vue -Recurse | ForEach-Object {
Get-Content $_.FullName | Out-String | Where-Object {
$_ -cmatch "7,Id\r\n6,Default"
} | ForEach-Object {
$matches[0]
}
} | Out-File C:\test.txt
With Select-String, the -Pattern parameter is regex capable, so try this:
Get-ChildItem "D:\Wim\TM1\TI processes" -Filter *.vue -Recurse |
Select-String -Pattern "7,Id|6,Default" -CaseSensitive |
Out-File C:\test.txt
The vertical bar (|) is the regex alternation operator, in other words an "or". This pattern matches any line that contains either string, so it does not check that the two lines are consecutive.
I'm running the following script to check a group of files for card numbers. When I run it against a group of 38 files that total 600 MB, it consumes max CPU (restricted to 50%) and max memory (3.3 GB of 4.0 GB physical).
Looking for ideas on why this may be and how to optimize this.
Thanks!
Get-ChildItem "c:\REGEX\ScanMeFiles\" -Recurse |`
Foreach-Object{
$content = Get-Content $_.FullName
$outfile = 'c:\regex\results\'+$_.BaseName+'_results.log'
$content | Where-Object {$_ -match '\b(?:3[47]\d|(?:4\d|5[1-5]|65)\d{2}|6011)\d{12}\b'} | Set-Content $outfile
}
I would make it a little more contained. Do something like this with fewer variables:
# Iterate over file objects (not just paths) so BaseName is still available for the output file name.
foreach ($child in Get-ChildItem -File) {
Get-Content $child.FullName | Where-Object { $_ -match '\b(?:3[47]\d|(?:4\d|5[1-5]|65)\d{2}|6011)\d{12}\b' } | Set-Content ('c:\regex\results\' + $child.BaseName + '_results.log')
}
With Matt's help, this is what I came up with. It runs in under 1 minute against my test data. Thanks!
Get-ChildItem "c:\REGEX\ScanMeFiles\" |
Foreach-Object{
$content = $_.FullName
$outfile = 'c:\regex\results\'+$_.BaseName+'_results.log'
$regex = '\b(?:3[47]\d|(?:4\d|5[1-5]|65)\d{2}|6011)\d{12}\b'
Select-String -Path $content -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } | Set-Content $outfile
}
I need to search the file names in a directory for characters at certain positions. I am looking for files with parentheses nested within parentheses,
like this:
# 2262281102-03_Cutting_Plate_Lower_Stop_(Anschlag_Cutting_Frame_(Schnittgestell)_unten)_400kN
GET-CHILDITEM C:\BU\p -recurse | WHERE-OBJECT {$_.nAME -MATCH "(?!)((?!)((!?))(!?))(!?)"}
I also need to match any file with 4+ letters and no parentheses, i.e.:
# 2277131504-03_Haltebolzen_platte
GET-CHILDITEM C:\BU\p -EXCLUDE "*)*" -recurse | WHERE-OBJECT {$_.nAME -MATCH "\W\.[^\W]"}
I've got this:
$tests = @(
'2262281102-03_Cutting_Plate_Lower_Stop_(Anschlag_Cutting_Frame_(Schnittgestell)_unten)_400kN',
'2277131504-03_Haltebolzen_platte'
)
$regex = '^.*\(.*\(.*\).*\).*$|^[^()]*[a-z]{4}[^()]*$'
$tests -match $regex
2262281102-03_Cutting_Plate_Lower_Stop_(Anschlag_Cutting_Frame_(Schnittgestell)_unten)_400kN
2277131504-03_Haltebolzen_platte
I have a list of regular expressions (about 2000) and over a million HTML files. I want to check, for every file, whether each regular expression matches or not. How can I do this in PowerShell?
Performance is important, so I don't want to loop through the regular expressions.
I tried
$text | Select-String -Pattern pattern1, pattern2,...
It returns all matches, but I also want to find out which patterns matched and which did not. I need to build a list of the matching regular expressions for each file.
You could try something like this:
$regex = "^test","e2$" #Or use (Get-Content <path to your regex file>)
$ht = @{}
#Modify Get-ChildItem to your criteria (filter, path, recurse etc.)
Get-ChildItem -Filter *.txt | Select-String -Pattern $regex | ForEach-Object {
$ht[$_.Path] += @($_ | Select-Object -ExpandProperty Pattern)
}
Test-output:
$ht | Format-Table -AutoSize
Name                                               Value
----                                               -----
C:\Users\graimer\Desktop\New Text Document (2).txt {e2$}
C:\Users\graimer\Desktop\New Text Document.txt     {^test, e2$}
You didn't specify how you wanted the output.
UPDATE: To match multiple patterns on a single line, try this (mjolinor's answer is probably faster than this).
$regex = "^test","e2$" #Or use (Get-Content <path to your regex file>)
$ht = @{}
#Modify Get-ChildItem to your criteria (filter, path, recurse etc.)
$regex | ForEach-Object {
$pattern = $_
Get-ChildItem -Filter *.txt | Select-String -Pattern $pattern | ForEach-Object {
$ht[$_.Path] += @($_ | Select-Object -ExpandProperty Pattern)
}
}
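The reason for looping per pattern can be illustrated in memory. As far as I can tell, a single Select-String call with several patterns emits a matching line only once, with one Pattern value attached, so a line that satisfies two patterns is credited to just one of them. The sketch below uses a made-up sample line; it is an illustration of that behaviour, not a benchmark.

```powershell
$regex = '^test', 'e2$'
$line  = 'test line e2'   # sample text that satisfies both patterns

# One call with all patterns: the line is emitted once, carrying a single Pattern value.
$single = @($line | Select-String -Pattern $regex)

# One call per pattern: every pattern that matches is recorded.
$perPattern = foreach ($pattern in $regex) {
    $line | Select-String -Pattern $pattern | ForEach-Object { $_.Pattern }
}

"Single call: $($single.Pattern -join ', ')"
"Per pattern: $($perPattern -join ', ')"
```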
UPDATE2: I don't have enough samples to try it, but since you have such a huge number of files, you might want to try reading each file into memory before looping through the patterns. It may be faster.
$regex = "^test","e2$" #Or use (Get-Content <path to your regex file>)
$ht = @{}
#Modify Get-ChildItem to your criteria (filter, path, recurse etc.)
Get-ChildItem -Filter *.txt | ForEach-Object {
$text = $_ | Get-Content
$filename = $_.FullName
$regex | ForEach-Object {
$text | Select-String -Pattern $_ | ForEach-Object {
$ht[$filename] += @($_ | Select-Object -ExpandProperty Pattern)
}
}
}
I don't see any way around doing a foreach through the regex collection.
This is the best I could come up with performance-wise:
$regexes = 'pattern1','pattern2'
$files = get-childitem -Path <file path> |
select -ExpandProperty fullname
$ht = @{}
foreach ($file in $files)
{
$ht[$file] = New-Object collections.arraylist
foreach ($regex in $regexes)
{
if (select-string $regex $file -Quiet)
{
[void]$ht[$file].add($regex)
}
}
}
$ht
You could speed up the process by using background jobs and dividing up the file collection among the jobs.
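A rough sketch of the background-job idea follows. It is an assumption-laden illustration rather than a tuned solution: the temporary files, the job count of 2, and the chunking scheme are all made up for the example, and whether jobs actually help depends on the job startup overhead versus the work done per file.

```powershell
# Build a few throwaway files so the sketch is self-contained (sample data only).
$dir = Join-Path ([IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'a.txt') -Value 'test one'
Set-Content -Path (Join-Path $dir 'b.txt') -Value 'line e2'
Set-Content -Path (Join-Path $dir 'c.txt') -Value 'test e2'

$regexes   = '^test', 'e2$'
$files     = @(Get-ChildItem -Path $dir -File | Select-Object -ExpandProperty FullName)
$jobCount  = 2   # assumption: tune to the number of cores
$chunkSize = [int][math]::Ceiling($files.Count / $jobCount)

# Start one job per chunk of the file list.
$jobs = for ($i = 0; $i -lt $files.Count; $i += $chunkSize) {
    $end   = [math]::Min($i + $chunkSize, $files.Count) - 1
    $chunk = $files[$i..$end]
    Start-Job -ArgumentList $chunk, $regexes -ScriptBlock {
        param($chunk, $regexes)
        foreach ($file in $chunk) {
            # Same per-file, per-pattern check as above, using -Quiet for a yes/no answer.
            $hits = @($regexes | Where-Object { Select-String -Pattern $_ -Path $file -Quiet })
            [pscustomobject]@{ File = $file; Patterns = $hits }
        }
    }
}

# Collect the results from all jobs into one hashtable keyed by file path.
$ht = @{}
$jobs | Receive-Job -Wait -AutoRemoveJob | ForEach-Object { $ht[$_.File] = @($_.Patterns) }
$ht

Remove-Item -Path $dir -Recurse -Force
```

Each job returns plain objects instead of a hashtable so that the results can simply be streamed back and merged in the parent session.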