Using PowerShell, I am trying to determine which Perl scripts in a directory are not called from any other script. In my Select-String I am grouping the matches because I use some other logic to filter out results where the line is commented, plus a number of other scenarios I want to exclude (for simplicity I have left that out of the code posted below). My main problem is in the "-notin" part.
I can get this to work if I remove the grouping from Select-string and only match the filename itself. So this works.
$searchlocation = "C:\Temp\"
$allresults = Select-String -Path "$searchlocation*.pl" -Pattern '\w+\.pl'
$allperlfiles = Get-Childitem -Path "$searchlocation*.pl"
$allperlfiles | foreach-object -process{
$_ | where {$_.name -notin $allresults.matches.value} | Select -expandproperty name | Write-Host
}
However I cannot get the following to work. The only difference between this and above is the value for the "-Pattern" and the value after "-notin". I'm not sure how to use "notin" along with matching groups.
$searchlocation = "C:\Temp\"
$allresults = Select-String -Path "$searchlocation*.pl" -Pattern '(.*?)(\w+\.pl)'
$allperlfiles = Get-Childitem -Path "$searchlocation*.pl"
$allperlfiles | foreach-object -process{
$_ | where {$_.name -notin $allresults.matches.groups[2].value} | Select -expandproperty name | Write-Host}
At a high level the code should search all Perl scripts in a directory for any lines that execute any other Perl script. With that I now have $allresults, which basically gives me a list of all Perl scripts called from other files. To get the inverse of that (files that are NOT called from any other file) I get a list of all Perl scripts in the directory, cycle through those, and list the ones that DON'T show up in $allresults.
When you select a grouping you need to do so using a Select statement, or iteratively in a loop; otherwise you are only going to select a single value from the flattened list of groups.
For example, if your $allresults object contains matches for
File1.pl, File2.pl, File3.pl
then $allresults.Matches.Groups[2].Value returns only File1.pl (group 2 of the first match).
Instead, you need to select those values:
$allresults | select @{N="Match";E={ $($_.Matches.Groups[2].value) } }
Which will return:
Match
-----
File1.pl
File2.pl
File3.pl
In your specific example, each match has three sub-items (the whole match plus the two capture groups), and the results are flattened sequentially, so what you would term "match 1, group 1" is Groups[0] while "match 2, group 1" is Groups[3].
This means the matches you care about (those with grouping 2) are at the array indexes {2,5,8,11,...}, which can be described as (N*3-1), where N is the 1-based number of the match. So for match 1 it is (1*3)-1 = [2], while for match 13 it is (13*3)-1 = [38].
You can iterate through them using a loop to check:
for($i=0; $i -le ($allresults.Matches.groups.count-1); $i++){
"Group[$i] = ""$($allresults.Matches.Groups[$i].value)"""
}
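To see the flattening for yourself without touching any files, here is a minimal sketch that feeds three made-up lines to Select-String. The sample lines and the names $sampleLines, $sampleResults, $flat and $fileNames are assumptions for illustration only, and it relies on member enumeration, so it needs PowerShell 3.0 or later.

```powershell
# Hypothetical input: three lines that each call one Perl script.
$sampleLines   = 'perl File1.pl', 'perl File2.pl', 'perl File3.pl'
$sampleResults = $sampleLines | Select-String -Pattern '(.*?)(\w+\.pl)'

# Member enumeration flattens every match's groups into one list:
# 3 matches x 3 groups (whole match, group 1, group 2).
$flat = $sampleResults.Matches.Groups
$flat.Count        # 9
$flat[2].Value     # File1.pl (group 2 of the first match only)

# Picking group 2 per match avoids the index arithmetic altogether.
$fileNames = $sampleResults | ForEach-Object { $_.Matches[0].Groups[2].Value }
$fileNames         # File1.pl, File2.pl, File3.pl
```

Indexing per match, as in the last statement, is what the Select and loop approaches in this answer boil down to.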
I noticed that you took the time to avoid loops in collecting your data, but then seem to have accidentally fallen back on one when matching your data.
-notin and the other comparison operators, when used in Select and Where clauses, don't need a loop structure and are faster if not looped, so you can forgo the ForEach-Object loop and have a better process just by using a simple Where (?).
$SearchLocation = "C:\Temp\"
$FileGlob = "*.pl"
$allresults = Select-String -Path "$SearchLocation$FileGlob" -Pattern '(.*?)([\w\.]+\.pl)'
$allperlfiles = Get-ChildItem -Path "$SearchLocation$FileGlob"
$allperlfiles | ? {
    # .Match unwraps the calculated property so -notin compares plain strings
    $_.name -notin $(
        $allresults | select @{N="Match";E={ $($_.Matches.Groups[2].value) } }
    ).Match
} | Select -expandproperty name | Write-Host
Now, that should be faster and simpler code to maintain, but, as you may have noticed, it still has some redundancies now that you are not looping.
You are piping everything into a Select, which can do the work of the Where, and you are only looking to match the Name property here. So you can either forgo the last Select by piping only the name of the file in the first place, or forgo the Where and select exactly what you want.
I think the former is far simpler; the latter is useful if you are going to actually do something with those other values inside the loop that we don't know about yet.
Finally, Write-Host is likely redundant, as any object output will echo to the console.
Here is that version which incorporates the removal of the unnecessary loops and removes redundancies related to the output of the info you wanted, all together.
$SearchLocation = "C:\Temp\"
$FileGlob = "*.pl"
# Build the extension part of the regex from the glob: drop the "*" and escape the dot
$Extension = [regex]::Escape($FileGlob.TrimStart('*'))
$allresults = Select-String -Path "$SearchLocation$FileGlob" -Pattern ('(.*?)([\w\.]+' + $Extension + ')')
$allperlfiles = Get-ChildItem -Path "$SearchLocation$FileGlob"
$allperlfiles.name | ? {
    $_ -notin $(
        $allresults | select @{
            N="Match";E={
                $($_.Matches.Groups[2].value)
            }
        }
    ).Match
}
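If you want to try the idea end to end before pointing it at a real folder, the following is a rough sketch that builds a throwaway directory with three tiny Perl files. The folder, the file names (main.pl, helper.pl, orphan.pl) and their contents are all made up for the demo, and it collects group 2 with ForEach-Object rather than a calculated property, which is just one of several ways to get plain strings for -notin.

```powershell
# Hypothetical scratch folder and file contents, for illustration only.
$demoDir = Join-Path ([IO.Path]::GetTempPath()) ("pl-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $demoDir | Out-Null
Set-Content -Path (Join-Path $demoDir 'main.pl')   -Value 'system("perl helper.pl");'
Set-Content -Path (Join-Path $demoDir 'helper.pl') -Value 'print "hi\n";'
Set-Content -Path (Join-Path $demoDir 'orphan.pl') -Value 'print "unused\n";'

# Group 2 of each match is the name of a script that some file calls.
$called = Select-String -Path (Join-Path $demoDir '*.pl') -Pattern '(.*?)(\w+\.pl)' |
    ForEach-Object { $_.Matches[0].Groups[2].Value }

# Everything in the folder that never shows up in $called.
$uncalled = (Get-ChildItem -Path (Join-Path $demoDir '*.pl')).Name |
    Where-Object { $_ -notin $called } |
    Sort-Object

$uncalled   # main.pl and orphan.pl

Remove-Item -Recurse -Force $demoDir
```

Only helper.pl is referenced by another file here, so the other two are reported.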
Related
I have a file with lines that I wish to remove, like the following:
key="Id" value=123"
key="FirstName" value=Name1"
key="LastName" value=Name2"
<!--key="FirstName" value=Name3"
key="LastName" value=Name4"-->
key="Address" value=Address1"
<!--key="Address" value=Address2"
key="FirstName" value=Name1"
key="LastName" value=Name2"-->
key="ReferenceNo" value=765
I have tried the following:
$values = @('key="FirstName"','key="Lastname"', 'add key="Address"');
$regexValues = [string]::Join('|',$values)
$lineprod = Get-Content "D:\test\testfile.txt" | Select-String $regexValues | Select-Object -ExpandProperty Line
if ($null -ne $lineprod)
{
foreach ($value in $lineprod)
{
$prod = $value.Trim()
$contentProd | ForEach-Object {$_ -replace $prod,""} |Set-Content "D:\test\testfile.txt"
}
}
The issue is that only some of the lines get replaced or removed, and some remain.
The output should be
key="Id" value=123"
key="ReferenceNo" value=765
But i seem to get
key="Id" value=123"
key="ReferenceNo" value=765
<!--key="Address" value=Address2"
key="FirstName" value=Name1"
key="LastName" value=Name2"-->
Any ideas as to why this is happening or changes to the code above ?
Based on your comment, the token 'add key="Address"' should be changed to just 'key="Address"'; then the concatenating logic to build your regex looks good. You need to use the -NotMatch switch so it returns every line except those matching the values. Also, Select-String can read files, so Get-Content can be removed.
Note that the use of (...) in this case is important because you're reading and writing to the same file in the same pipeline. Wrapping the statement in parentheses ensures that all output from Select-String is consumed before it is passed through the pipeline. Otherwise, you would end up with an empty file.
$values = 'key="FirstName"', 'key="Lastname"', 'key="Address"'
$regexValues = [string]::Join('|', $values)
(Select-String D:\test\testfile.txt -Pattern $regexValues -NotMatch) |
ForEach-Object Line | Set-Content D:\test\testfile.txt
Outputs:
key="Id" value=123"
key="ReferenceNo" value=765
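For anyone who wants to check this without risking a real config file, here is a small sketch that runs the same approach against a temporary file. The temp file, the shortened sample content and the names $demoFile, $demoPattern and $remaining are assumptions for the demo. It also shows that Select-String is case-insensitive by default, which is why 'key="Lastname"' still removes the key="LastName" lines.

```powershell
# Hypothetical scratch file holding a shortened version of the sample data.
$demoFile = [IO.Path]::GetTempFileName()
Set-Content -Path $demoFile -Value @(
    'key="Id" value=123"'
    'key="FirstName" value=Name1"'
    '<!--key="Address" value=Address2"'
    'key="LastName" value=Name2"-->'
    'key="ReferenceNo" value=765'
)

$demoPattern = 'key="FirstName"', 'key="Lastname"', 'key="Address"' -join '|'

# The parentheses make Select-String finish reading before Set-Content
# opens the same file for writing.
(Select-String -Path $demoFile -Pattern $demoPattern -NotMatch) |
    ForEach-Object Line |
    Set-Content -Path $demoFile

$remaining = Get-Content -Path $demoFile
$remaining   # the Id and ReferenceNo lines

Remove-Item $demoFile
```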
Let's say I have a file with a following content:
1,first_string,somevalue
2,second_string,someothervalue
n,n_nd_string,somemorevalue
I need to Get-Content of this file, take the last line, and get the number before the "," symbol (n in this case). (I'll just increment it and append an n+1 line to this file, but that does not matter right now.) I want all of this done in a single pipeline.
I have come to this solution so far:
[int]$number = Get-Content .\1.mpcpl | Select-Object -Last 1 | Select-String -Pattern '^(\d)' | ForEach-Object {$_.matches.value}
It actually works, but I wonder if there is any way of addressing the object returned by Select-String -Pattern '^(\d)' without using the foreach loop? I know that the returned collection in my case will only consist of one element (I'm selecting the single last line of a file, so I get only one match).
You may use
$num = [int]((Get-Content .\1.mpcpl | Select-Object -Last 1) -replace '^(\d).*', '$1')
Notes
(Get-Content .\1.mpcpl | Select-Object -Last 1) - reads the file and gets the last line
(... -replace '^(\d).*', '$1') - gets the first digit from the returned line (NOTE: if the line does not start with a digit, it will fail as the output will be the whole line)
[int] casts the string value to an int.
Another way can be getting a match and then retrieving it from the default $matches variable:
[IO.File]::ReadAllText($filepath) -match '(?m)^(\d).*\z' | Out-Null
[int]$matches[1]
The (?m)^(\d).*\z pattern gets the digit at the start of the last line into Group 1, hence $matches[1].
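Here is a quick sketch of both ideas on in-memory lines, so it can be pasted straight into a console. The sample lines and variable names are made up, and it deliberately uses \d+ instead of \d: that is my own tweak, since \d only captures a single digit and a line starting with 12 would otherwise give 1.

```powershell
# Hypothetical lines standing in for the file content.
$sampleLines = '1,first_string,somevalue',
               '2,second_string,someothervalue',
               '12,n_nd_string,somemorevalue'
$lastLine = $sampleLines | Select-Object -Last 1

# -replace keeps only the captured leading digits.
$viaReplace = [int]($lastLine -replace '^(\d+).*', '$1')

# -match fills the automatic $Matches variable; the if guards against
# a last line that does not start with a number.
$viaMatch = if ($lastLine -match '^(\d+),') { [int]$Matches[1] }

$viaReplace      # 12
$viaMatch + 1    # 13, the number for the line to append
```

The guarded -match form is a bit safer, because the -replace form hands the whole line to [int] when nothing matches, and the cast then throws.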
It looks like a csv to me...
import-csv 1.mpcpl -header field1,field2,field3 | select -last 1 | select -expand field1
output:
n
The CSV file only contains a partial entry from the last regex match.
I've used the ISE debugger and can verify it's finding matches.
$h = @{}
$a = @()
Get-ChildItem C:\Users\speterson\Documents\script\*.kiy | foreach {
Get-Content $_ | foreach {
if ($_ -match 'IF Ingroup\s+\(\s+\"(..+?)\"\s+\)') {
$h.Group = $matches[1]
}
if ($_ -match 'use\s+([A-Za-z]):"(\\\\..*?\\..*)\"') {
$h.DriveLetter = $matches[1].ToUpper()
$h.Path = $matches[2]
}
}
$a += New-Object PSCustomObject -Property $h
}
$a | Export-Csv c:\temp\Whatever.csv -NoTypeInfo
The input files look like this, but have 1000+ lines in them:
IF Ingroup ( "RPC3WIA01NT" )
use v: /del
ENDIF
IF Ingroup ( "JWA03KRONOSGLOBAL" )
use v:"\\$homesrvr\$dept"
ENDIF
IF Ingroup ( "P-USERS" )
use p:'\\PServer\PDRIVE
ENDIF
CSV file only shows:
GROUP
P-USERS
I want to ignore the drive letters with the /del.
I'm trying to get a CSV file that shows
Group Drive Path
JWA03KRONOSGLOBAL V \\$homesrvr\$dept
P-USERS P \\PServer\PDRIVE
Your code has two loops, one nested in the other. The outer loop processes each file from the Get-ChildItem call. The inner loop processes the content of the current file of the outer loop. However, since you're creating your objects after the inner loop finished you're only getting the last result from each processed file. Move object creation into the inner loop to get all results from all files.
I'd also recommend not re-using a hashtable. Re-using objects always bears the risk of having data carried over somewhere undesired. Hashtable creation is so inexpensive that running that risk is never worth it.
On top of that your processing of the files' content is flawed, because the inner loop processes the content one line at a time, but both of your conditionals match on different lines and are not linked to each other. If you created a new object with every iteration that would give you incorrect results. Read the file as a whole and then use Select-String with a multiline regex to extract the desired information.
Another thing to avoid is appending to an array in a loop (that's a slow operation because it involves re-creating the array and copying elements over and over). Since you're using ForEach-Object you can pipe directly into Export-Csv.
Something like this should work:
$re = 'IF Ingroup\s+\(\s+"(.+?)"\s+\)\s+' +
"use\s+([a-z]):\s*[`"'](\\\\[^`"'\s]+)"
Get-ChildItem 'C:\Users\speterson\Documents\script\*.kiy' | ForEach-Object {
Get-Content $_.FullName |
Out-String |
Select-String $re -AllMatches |
Select-Object -Expand Matches |
ForEach-Object {
New-Object -Type PSObject -Property @{
'Group' = $_.Groups[1].Value
'DriveLetter' = $_.Groups[2].Value
'Path' = $_.Groups[3].Value
}
}
} | Export-Csv 'C:\path\to\output.csv' -NoType
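As a sanity check of the multiline regex, here is a sketch that runs it against the sample block from the question held in a here-string, so no files are needed. The names $sampleText and $rows are made up, and the .ToUpper() on the drive letter is my own addition to match the uppercase letters in the desired output.

```powershell
# The sample input from the question, as a single multi-line string.
$sampleText = @'
IF Ingroup ( "RPC3WIA01NT" )
use v: /del
ENDIF
IF Ingroup ( "JWA03KRONOSGLOBAL" )
use v:"\\$homesrvr\$dept"
ENDIF
IF Ingroup ( "P-USERS" )
use p:'\\PServer\PDRIVE
ENDIF
'@

$re = 'IF Ingroup\s+\(\s+"(.+?)"\s+\)\s+' +
      "use\s+([a-z]):\s*[`"'](\\\\[^`"'\s]+)"

# One MatchInfo comes back; -AllMatches puts every hit into its Matches.
$rows = $sampleText |
    Select-String $re -AllMatches |
    Select-Object -ExpandProperty Matches |
    ForEach-Object {
        [pscustomobject]@{
            Group       = $_.Groups[1].Value
            DriveLetter = $_.Groups[2].Value.ToUpper()   # assumption: uppercase wanted
            Path        = $_.Groups[3].Value
        }
    }

$rows | Format-Table -AutoSize
```

The "/del" entry has no quoted path after the drive letter, so the regex skips it without any extra filtering.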
I'm importing a KIX file:
$KIXOLD = get-content E:\File.kix
The file contains content such as this:
$ScriptVer = "12.0" ; Current Script Version Number
I need to get the script version, in this case 12.0, however that number can vary based upon which file I'm importing.
I've tried Select-String and regex like this:
$OLDVER = $KIXOLD | Select-String -Pattern "\$ScriptVer = `"\d\d\.\d`""
But that still grabs the entire line including ; Current Script Version Number and not just the $scriptver = "12.0"
I'd imagine this has to be simple and I'm just going about it all wrong, but nothing I've tried has worked for me.
The end goal would be to just get 12.0 as an int, increment it and replace it, but I can't get that far until I can isolate the $scriptver = "12.0" from the rest of the multi-thousand line KIX file
try this
get-content "E:\File.kix" | where {$_ -like '$ScriptVer*'} | %{$_.split( '=;"')[2]}
Other method:
$template=@"
{Row*:ScriptVer = "{Version:12.0}" ; Xxx}
"@
(get-content "E:\File.kix" | ConvertFrom-String -TemplateContent $template).Row.Version
Does this help?
$OLDVER = [regex]::Match($KIXOLD,'ScriptVer = "([1-9]\d\.\d)"').groups[1].value
Select-String outputs Regex MatchInfo objects. To just get the value, you need to match against it, with a group of the text you want, then expand the match, then expand the group, then expand the value, e.g.
$Ver = Select-String -Path E:\File.kix -Pattern '^\$ScriptVer = "(.*?)"' |
Select-Object -ExpandProperty Matches |
ForEach-Object { $_.Groups[1].Value }
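Since the stated end goal is to increment the version and write it back, here is a hedged sketch of that last step on a single in-memory line. The variable names are made up, and bumping by a whole 1 and formatting with one decimal place are assumptions on my part, as the question does not say how the number should change.

```powershell
# Hypothetical line as it would appear in the KIX file.
$sampleLine = '$ScriptVer = "12.0" ; Current Script Version Number'

# Capture only the text between the quotes.
$m = [regex]::Match($sampleLine, '^\$ScriptVer = "(.*?)"')
$oldVersion = [decimal]$m.Groups[1].Value

# Assumption: bump by 1 and keep one decimal place, independent of locale.
$newVersion = ($oldVersion + 1).ToString('0.0', [cultureinfo]::InvariantCulture)

# Swap only the quoted number and leave the rest of the line alone.
$updatedLine = $sampleLine -replace '(?<=^\$ScriptVer = ")[^"]*', $newVersion
$updatedLine
```

A decimal is used rather than an int because 12.0 would lose its fractional part in an integer cast.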
I have a fixed width file with records in a format as follows
DDEDM2018890 19960730015000010000
DDETPL015000 20150515015005010000
DDETPL015010 20150515015003010000
DDETPL015020 20150515015002010000
DDETPL015030 20150515015005010000
DDETPL015040 20150515015000010000
The first 3 characters identify the record type. In the above example all records are of type DDE, but there are also lines of a different type in the file.
The following regular expression with named capture groups parses the relevant information from each record for my purpose (notice that it also filters down to DDE record types):
DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})
play with this regex on this excellent online parser
I have written a script that uses the Get-Content, ForEach-Object and Select-Object cmdlets to convert the fixed width file into a csv file.
I wonder if I could replace the Get-Content and ForEach-Object cmdlets by a single Select-String cmdlet?
#this powershell script reads fixed width file and generates a csv file of the relevant & converted values
#Prepare hashtable (calculated property) for Select-Object to convert CategoryCode and append with CategoryId
$Category = @{
Name = "Category"
Expression = {
$cat = switch($_.CategoryCode)
{
"50"{"A"}
"54"{"C"}
"60"{"F"}
"66"{"I"}
"74"{"M"}
"88"{"T"}
}
$cat+$_.CategoryId
}
}
gc "C:\Path\To\File.txt" | % {
if($_ -match "DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3}).*$")
{
#$matches is a hashtable of named capture groups, convert to object to allow Select-Object to handle its entries as object properties
[PSCustomObject]$matches
}
} | select Database, $Category, Length #| export-csv "AnalysisLengths.csv" -NoTypeInformation
Before I finalized the script, I was trying to use the Select-String cmdlet but could not figure out how to use it, I believe it can achieve the same result in a more eloquent way... this is what I had:
##Could this be completed with just the Select-String commandlet instead of Get-Content+ForEach+Select-Object?
Select-String -Path "C:\Path\To\File.txt" `
-Pattern "DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})" `
| Select-Object -ExpandProperty Matches
Using -ExpandProperty should convert the Microsoft.PowerShell.Commands.MatchInfo Matches property into the actual System.Text.RegularExpressions.Match objects for each line...
see also Powershell Select-Object vs ForEach on Select-String results
Here is one way (I'm not so proud of it)
Select-String -Path "C:\Path\To\File.txt" -Pattern "DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})" | %{New-Object -TypeName PSObject -Property @{Database=$_.matches.groups[1];CategoryCode=$_.matches.groups[2];CategoryId=$_.matches.groups[3];Length=$_.matches.groups[4]}} | export-csv "C:\Path\To\File.csv"
I don't know why you have limited your question to Select-String cmdlet. If you had included the switch statement, then, I'd answer to you: YES! It's possible!
And I'd present to you this simple and short PowerShell code:
$(switch -Regex -File $fileIN{$patt{[pscustomobject]$matches|select * -ExcludeProperty 0}})|epcsv $fileCSV
where $fileIN is the input file, $fileCSV is CSV file you wanna create, and $patt is the pattern you have in your OP:
$patt='DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})'
The switch statement is very powerful.
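The one-liner is rather dense, so here is the same idea spread out and run against a few in-memory records instead of a file. The array $sampleRecords (including the made-up XYZ record) and the name $rows are assumptions for the demo; with a real file you would use -File as in the line above.

```powershell
$patt = 'DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})'

# Hypothetical records: two DDE lines from the question plus one other type.
$sampleRecords = 'DDEDM2018890 19960730015000010000',
                 'XYZ000000000 00000000000000000000',
                 'DDETPL015000 20150515015005010000'

# switch -Regex walks the array; a matching clause fills $Matches with
# the named groups plus entry 0 (the whole match), which is dropped here.
$rows = switch -Regex ($sampleRecords) {
    $patt {
        [pscustomobject]$Matches | Select-Object * -ExcludeProperty 0
    }
}

$rows | Format-Table -AutoSize
```

Records that do not match the pattern simply fall through, so the non-DDE line produces no row.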
While Select-String can combine Get-Content and pattern matching, you still need a loop for constructing your custom objects. You could stick with what you have, although I'd suggest a couple modifications. Replace the switch statement with a hashtable and make the nested if a Where-Object filter:
$categories = @{
'50' = 'A'
'54' = 'C'
'60' = 'F'
'66' = 'I'
'74' = 'M'
'88' = 'T'
}
$category = @{
Name = 'Category'
Expression = { $categories[$_.CategoryCode] + $_.CategoryId }
}
$pattern = 'DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})'
Get-Content 'C:\path\to\file.txt' |
? { $_ -match $pattern } |
% { [PSCustomObject]$matches } |
select Database, $category, Length |
Export-Csv 'C:\path\to\output.csv' -NoType
Or you could go with @JPBlanc's suggestion (again with some slight modifications):
$category = @{
'50' = 'A'
'54' = 'C'
'60' = 'F'
'66' = 'I'
'74' = 'M'
'88' = 'T'
}
$pattern = "DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})"
Select-String -Path 'C:\path\to\file.txt' -Pattern $pattern | % {
New-Object -TypeName PSObject -Property @{
Database = $_.Matches.Groups[1].Value
Category = $category[$_.Matches.Groups[2].Value] + $_.Matches.Groups[3].Value
Length = $_.Matches.Groups[4].Value
}
} | Export-Csv 'C:\path\to\output.csv' -NoType
The latter will give you slightly better performance, although not too much (execution times were 2:35 vs 2:50 for a 120 MB input file on my test box).
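One small variation on the Select-String version: since the pattern uses named groups, you can index the groups by name rather than by number, which keeps working if someone later adds another group to the regex. The sketch below runs on in-memory records; the arrays and the names $sampleRecords, $categoryMap and $rows are made up for the demo.

```powershell
$categoryMap = @{ '50' = 'A'; '54' = 'C'; '60' = 'F'; '66' = 'I'; '74' = 'M'; '88' = 'T' }
$pattern = 'DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})'

# Hypothetical records: two DDE lines from the question plus one other type.
$sampleRecords = 'DDEDM2018890 19960730015000010000',
                 'XYZ ignored record',
                 'DDETPL015000 20150515015005010000'

$rows = $sampleRecords | Select-String -Pattern $pattern | ForEach-Object {
    $g = $_.Matches[0].Groups          # groups of the single match on this line
    [pscustomobject]@{
        Database = $g['Database'].Value
        Category = $categoryMap[$g['CategoryCode'].Value] + $g['CategoryId'].Value
        Length   = $g['Length'].Value
    }
}

$rows | Format-Table -AutoSize
```

From here the rows can be piped to Export-Csv exactly as in the versions above.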