Regex in powershell does not work as expected - regex

I store the output of a defragmentation analysis in a variable, then I try to match a pattern to retrieve a number.
In an online regex tester it works fine, but in PowerShell, $result -match $pattern returns False.
My code:
$result = Defrag C: /A /V | Out-String
echo $result
$pattern = "fragmenté[^.0-9]*([0-9]+)%"
$result -match $pattern
What am I doing wrong?

I actually had no issue with your code. I just needed to change the pattern to support my English output. It is possible that Wolfgang Kluge is onto something about the whitespace. However, if your output actually matches what you have in the regex tester, then I'm not sure what issue you are having.
For fun I propose this update to your code. This uses ConvertFrom-StringData. I explain the code more in this answer.
$defrag = Defrag C: /A /V | out-string
$hash = (($defrag -split "`r`n" | Where-Object{$_ -match "="}) -join "`r`n" | ConvertFrom-StringData)
$result = New-Object -TypeName PSCustomObject -Property $hash
$result."Quantité totale d'espace fragmenté"
This of course assumes that your PowerShell is perfectly OK with the accents in the words. On my ISE 3.0 the above code works.
Again, the code in your question was working just fine for me. I also don't think Out-String is required; I still get positive output without it. Without Out-String I get extra output that includes the entire matched line. With it I just get a Boolean. In both cases (using the following code) I still get a result.
$result = Defrag C: /A /V #| Out-String
$pattern = "fragmented space[^.0-9]*([0-9]+)%"
$result -match $pattern
$Matches[1]
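-match works as a filter when its left operand is an array, which changes how $result is treated. With Out-String, $result is a single System.String; without it, you get a System.Object[] with one element per line. Note that $Matches is only populated when the left operand is a single string, so $Matches[1] is reliable only in the Out-String form.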
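The difference can be sketched without running Defrag at all. The two sample lines below are a hypothetical stand-in for the real output (the wording and the 12% figure are invented for illustration):

```powershell
# Hypothetical stand-in for Defrag output; the text and number are invented.
$lines = @(
    'Volume size = 100 GB',
    'Total fragmented space = 12%'
)
$pattern = 'fragmented space[^.0-9]*([0-9]+)%'

# Array on the left: -match filters and returns the matching elements.
$fromArray = $lines -match $pattern

# Single string on the left: -match returns a Boolean and fills $Matches.
$fromString = ($lines | Out-String) -match $pattern
$percent = $Matches[1]
```

Here $fromArray holds the whole matching line, while $fromString is a Boolean and $percent holds only the captured digits.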
False Theory
The only way I can get the match to be False is if I am not running PowerShell as an administrator. That matters because without elevation you get this message instead of the analysis:
The disk defragmenter cannot start because you have insufficient priveleges to perform this operation. (0x89000024)

Related

Regular expression seems not to work in Where-Object cmdlet

I am trying to add quote characters around two fields in a file of comma separated lines. Here is one line of data:
1/22/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
which I would like to become this:
1/22/2018 0:00:00,"0000000","001B9706BE",1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
I began developing my regular expression in a simple PowerShell script, and soon I have the following:
$strData = '1/29/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0'
$strNew = $strData -replace "([^,]*),([^,]*),([^,]*),(.*)",'$1,"$2","$3",$4'
$strNew
which gives me this output:
1/29/2018 0:00:00,"0000000","001B9706BE",1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
Great! I'm all set. Extend this example to the general case of a file of similar lines of data:
Get-Content test_data.csv | Where-Object -FilterScript {
$_ -replace "([^,]*),([^,]*),([^,]*),(.*)", '$1,"$2","$3",$4'
}
This is a listing of test_data.csv:
1/29/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104938428,0016C4C483,1,45,0,1,0,0,0,0,0,0,0,0,0,0,35,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104943875,0016C4B0BC,1,31,0,1,0,0,0,0,0,0,0,0,0,0,25,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104948067,0016C4834D,1,33,0,1,0,0,0,0,0,0,0,0,0,0,23,0,1,0,0,0,0,0,0,0,0,0,0
This is the output of my script:
1/29/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104938428,0016C4C483,1,45,0,1,0,0,0,0,0,0,0,0,0,0,35,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104943875,0016C4B0BC,1,31,0,1,0,0,0,0,0,0,0,0,0,0,25,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104948067,0016C4834D,1,33,0,1,0,0,0,0,0,0,0,0,0,0,23,0,1,0,0,0,0,0,0,0,0,0,0
I have also tried this version of the script:
Get-Content test_data.csv | Where-Object -FilterScript {
$_ -replace "([^,]*),([^,]*),([^,]*),(.*)", "`$1,`"`$2`",`"`$3`",$4"
}
and obtained the same results.
My simple test script has convinced me that the regex is correct, but something happens when I use that regex inside a filter script in the Where-Object cmdlet.
What simple, yet critical, detail am I overlooking here?
Here is my PSVersion:
Major  Minor  Build  Revision
-----  -----  -----  --------
5      0      10586  117
You're misunderstanding how Where-Object works. The cmdlet outputs those input lines for which the -FilterScript expression evaluates to $true. It does NOT output whatever you do inside that scriptblock (you'd use ForEach-Object for that).
You don't need either Where-Object or ForEach-Object, though. Just put Get-Content in parentheses and use that as the first operand for the -replace operator. You also don't need the 4th capturing group. I would recommend anchoring the expression at the beginning of the string, though.
(Get-Content test_data.csv) -replace '^([^,]*),([^,]*),([^,]*)', '$1,"$2","$3"'
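As a rough sketch of why Where-Object appears to do nothing, the three approaches can be compared on in-memory lines. The sample rows are shortened versions of the question's data, and the variable names are only illustrative:

```powershell
# Shortened sample rows standing in for test_data.csv.
$lines = @(
    '1/29/2018 0:00:00,0000000,001B9706BE,1,21,0',
    '1/29/2018 0:00:00,104938428,0016C4C483,1,45,0'
)
$regex = '^([^,]*),([^,]*),([^,]*)'
$replacement = '$1,"$2","$3"'

# Where-Object only uses the scriptblock result as a truth test;
# a non-empty string is truthy, so every input line passes through unchanged.
$filtered = $lines | Where-Object { $_ -replace $regex, $replacement }

# ForEach-Object emits whatever the scriptblock produces.
$transformed = $lines | ForEach-Object { $_ -replace $regex, $replacement }

# -replace also works directly on an array, one element at a time.
$direct = $lines -replace $regex, $replacement
```

$filtered is identical to the input, while $transformed and $direct both contain the quoted fields.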
This seems to work here. I used ForEach-Object to process each record.
Get-Content test_data.csv |
ForEach-Object { $_ -replace "([^,]*),([^,]*),([^,]*),(.*)", '$1,"$2","$3",$4' }
This also seems to work. Uses the ? to create a reluctant (lazy) capture.
Get-Content test_data.csv |
ForEach-Object { $_ -replace '(.*?),(.*?),(.*?),(.*)', '$1,"$2","$3",$4' }
I would make only a small change to get this working: replace Where-Object -FilterScript with ForEach-Object, and escape the last $4 in the double-quoted replacement string with a backtick so that PowerShell does not expand it as a variable:
Get-Content c:\temp\test_data.csv | ForEach-Object {
$_ -replace "([^,]*),([^,]*),([^,]*),(.*)", "`$1,`"`$2`",`"`$3`",`$4"
}
I tested this with the data you provided and it adds the quotes to the correct columns.

powershell regex select string in variable

I am trying to create a script that selects the four digits that our company computers have in their host names.
I have tested the regex '\d{4}' on a regex web site, and it works fine to select the four digits, but when I use it in PowerShell I only get $true or $false.
I need the four digits kept in a variable for later use, but I haven't achieved that.
Any ideas?
$machinename = "mac0016w701"
$test = $machinename -match '\d{4}'
$test2= Select-String -Pattern '\d{4}' -inputobject $machinename
$test2
-match is an operator which returns true/false, so you can use it in tests. If you want the values from the regex, it sets the magic variable $Matches, e.g.
PS D:\> 'computer1234' -match '\d{4}'
True
PS D:\> $matches[0]
1234
Alternately, you could use:
[regex]::Matches('computer1234', '\d{4}').Value
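Applied to the host name from the question, a minimal sketch of three ways to keep the digits in a variable might look like this (the names $digits, $digits2 and $digits3 are just placeholders):

```powershell
$machinename = 'mac0016w701'

# Option 1: test with -match, then read the matched text from $Matches.
if ($machinename -match '\d{4}') {
    $digits = $Matches[0]
}

# Option 2: ask the .NET Regex class for the first match directly.
$digits2 = [regex]::Match($machinename, '\d{4}').Value

# Option 3: Select-String returns a MatchInfo; the text is under .Matches.
$digits3 = (Select-String -InputObject $machinename -Pattern '\d{4}').Matches[0].Value
```

All three variables end up holding the first run of four digits in the name.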

RegEx not matching when using Select String

I've verified that my regex is correct with this code:
#this is the string where I'm trying to extract everything within the []
$text = "MS14-012[2925418],MS14-029[2953522;2961851]"
$text -match "\[(.*?)\]"
$matches[1]
Output:
True
2925418
I'd like to use Select-String to get my result, like this for example:
$result = $text| Select-String -Pattern $regex
Output:
MS14-012[2925418],MS14-029[2953522;2961851]
What else I've tried:
$result = Select-String -Pattern $regex -InputObject $text
$result = Select-String -Pattern ([regex]::Escape("\[(.*?)\]")) -InputObject $text
And some more variations as well as different kinds of " and ' around the regex and so on. I'm really out of ideas...
Can anyone please tell me why the regex is not matching when I'm using Select-String?
After piping the output to Get-Member I noticed that Select-String returns a MatchInfo object and that I needed to access the MatchInfo.Matches property to get the result. Thanks to Mathias R. Jessen for giving me the hint! ;)
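A short sketch of that approach, assuming $regex holds the same pattern that was verified with -match in the question (the question never shows its definition):

```powershell
$text = 'MS14-012[2925418],MS14-029[2953522;2961851]'
# Assumption: same pattern as the -match test above.
$regex = '\[(.*?)\]'

# Select-String emits a MatchInfo object; printing it shows the whole input line.
$info = $text | Select-String -Pattern $regex -AllMatches

# The regex results live in .Matches; Groups[1] is the text inside the brackets.
$ids = $info.Matches | ForEach-Object { $_.Groups[1].Value }
```

-AllMatches is what makes the second bracketed group show up; without it, .Matches holds only the first match.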

Powershell RegEx not being invoked on piped output

Developed this statement on my primary workstation where it (correctly) outputs a delimited textfile:
type output.tmp | -match /r /v "^-[-|]*-.$" > output.csv
Now, working on my laptop (same win8.1) where supposedly all the same PS modules and snapins are loaded, it tosses an error:
-match : The term '-match' is not recognized as the name of a cmdlet,
Yet:
"Software" –match "soft"
works.
1) Why?
2) Is there a PS cmdlet I should invoke to get more verbose/helpful error output?
thx
-match is an operator that takes two arguments (one placed before and one after the -match).
But each pipeline segment must begin with a command (which includes cmdlets) [1].
There are two approaches:
Wrap the -match in a cmdlet such as ForEach-Object (or its % alias):
... | %{ $_ -match $regex } | ...
remembering that -match returns a Boolean, not the matched text.
Use Select-String, a cmdlet provided specifically for searching text. It does return the matched text (along with some other information), and it can read a file itself:
Select-String -Path $inputFile $regex
[1] Strictly speaking: except the first segment, which can be any expression.
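The two approaches, plus the Where-Object variant, can be compared on a few in-memory lines. This is only a sketch; the sample strings and variable names are made up:

```powershell
$lines = 'alpha-1', 'beta', 'alpha-2'
$regex = '^alpha'

# ForEach-Object emits the result of -match: one Boolean per input line.
$flags = $lines | ForEach-Object { $_ -match $regex }

# Where-Object uses that Boolean as a filter and emits the matching lines.
$kept = $lines | Where-Object { $_ -match $regex }

# Select-String emits a MatchInfo per matching line; .Line holds the text.
$viaSelect = ($lines | Select-String -Pattern $regex).Line
```

If the goal is a file of matching lines, $kept or $viaSelect is what should be redirected, not $flags.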
The reason for the error is that match is a comparison operator, not a cmdlet:
Comparison operators let you specify conditions for comparing values
and finding values that match specified patterns. To use a comparison
operator, specify the values that you want to compare together with an
operator that separates these values.
Also:
The match operators (-Match and -NotMatch) find elements that match or do not match a specified pattern using regular expressions.
The syntax is:
<string[]> -Match <regular-expression>
<string[]> -NotMatch <regular-expression>
The following examples show some uses of the -Match operator:
PS C:\> "Windows", "PowerShell" -Match ".shell"
PowerShell
PS C:\> (Get-Command Get-Member -Syntax) -Match "-view"
True
PS C:\> (Get-Command Get-Member -Syntax) -NotMatch "-path"
True
PS C:\> (Get-Content Servers.txt) -Match "^Server\d\d"
Server01
Server02
The match operators search only in strings. They cannot search in arrays of integers or other objects.
So, the correct syntax is:
@(type output.tmp) -match "^-[-|]*-.$" > output.csv
Note: Just as @mjolinor suggested, the @ prefix forces the (type output.tmp) into an array, just in case the input file contains only one line.
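Why the @ matters for a one-line file can be shown with a plain string standing in for the file content (a hypothetical value, since a single-line Get-Content result is just a string):

```powershell
$pattern = '^Server\d\d'

# A one-line file comes back from Get-Content as a single string, not an array.
$single = 'Server01'

# Scalar on the left: the result is a Boolean.
$asScalar = $single -match $pattern

# Wrapped in @(): the result is an array containing the matching line.
$asArray = @($single) -match $pattern
```

Redirecting $asScalar to a file would write True rather than the line itself, which is what the @() wrapper avoids.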
To get obnoxious amounts of debug output, get-Help Set-PSDebug
You simply need to add the following:
type output.tmp | ? { $_ -match "^-[-|]*-.$" } > output.csv
Or the more powershell-y way:
Get-Content -Path:"Output.Tmp" | Where { $_ -match "^-[-|]*-.$" } | Out-File -FilePath:"output.csv"

Multiline regex to match config block

I am having some issues trying to match a certain config block (multiple ones) from a file. Below is the block that I'm trying to extract from the config file:
ap71xx 00-01-23-45-67-89
use profile PROFILE
use rf-domain DOMAIN
hostname ACCESSPOINT
area inside
!
There are multiple ones just like this, each with a different MAC address. How do I match a config block across multiple lines?
The first problem you may run into is that in order to match across multiple lines, you need to process the file's contents as a single string rather than by individual line. For example, if you use Get-Content to read the contents of the file then by default it will give you an array of strings - one element for each line. To match across lines you want the file in a single string (and hope the file isn't too huge). You can do this like so:
$fileContent = [io.file]::ReadAllText("C:\file.txt")
Or in PowerShell 3.0 and later you can use Get-Content with the -Raw parameter:
$fileContent = Get-Content c:\file.txt -Raw
Then you need to specify a regex option to match across line terminators i.e.
SingleLine mode (. matches any char including line feed), as well as
Multiline mode (^ and $ match embedded line terminators), e.g.
(?smi) - note the "i" is to ignore case
e.g.:
C:\> $fileContent | Select-String '(?smi)([0-9a-f]{2}(-|\s*$)){6}.*?!' -AllMatches |
Foreach {$_.Matches} | Foreach {$_.Value}
00-01-23-45-67-89
use profile PROFILE
use rf-domain DOMAIN
hostname ACCESSPOINT
area inside
!
00-01-23-45-67-89
use profile PROFILE
use rf-domain DOMAIN
hostname ACCESSPOINT
area inside
!
Use the Select-String cmdlet to do the search because you can specify -AllMatches and it will output all matches whereas the -match operator stops after the first match. Makes sense because it is a Boolean operator that just needs to determine if there is a match.
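The effect of the two inline options can be seen on a tiny two-line string. This is a minimal sketch with made-up text, using the .NET Regex class directly so each option is tested in isolation:

```powershell
$text = "first line`nsecond line"

# Without (?s), . stops at the line feed, so the match ends with the first line.
$default = [regex]::Match($text, 'first.*').Value

# With (?s), . also matches the line feed and the match runs to the end.
$singleLine = [regex]::Match($text, '(?s)first.*').Value

# Without (?m), ^ only matches at the very start of the string.
$noMultiline = [regex]::Matches($text, '^\w+').Count

# With (?m), ^ also matches right after each line feed.
$multiline = [regex]::Matches($text, '(?m)^\w+').Count
```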
In case this may still be of value to someone and depending on the actual requirement, the regex in Keith's answer doesn't need to be that complicated. If the user simply wants to output each block the following will suffice:
$fileContent = [io.file]::ReadAllText("c:\file.txt")
$fileContent |
Select-String '(?smi)ap71xx[^!]+!' -AllMatches |
%{ $_.Matches } |
%{ $_.Value }
The regex ap71xx[^!]+! will perform better, and using .* in a regular expression is risky because greedy matching can produce unexpected results. The pattern [^!]+! matches one or more characters other than an exclamation mark, followed by an exclamation mark.
If the start of the block isn't required in the output, the updated script is:
$fileContent |
Select-String '(?smi)ap71xx([^!]+!)' -AllMatches |
%{ $_.Matches } |
%{ $_.Groups[1] } |
%{ $_.Value }
Groups[0] contains the whole matched string, Groups[1] will contain the string match within the parentheses in the regex.
If $fileContent isn't required for any further processing, the variable can be eliminated:
[io.file]::ReadAllText("c:\file.txt") |
Select-String '(?smi)ap71xx([^!]+!)' -AllMatches |
%{ $_.Matches } |
%{ $_.Groups[1] } |
%{ $_.Value }
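The Groups[0] versus Groups[1] distinction can be checked on a small in-memory sample. In this sketch the second block and its MAC address are invented, and the text is joined with line feeds rather than read from a file:

```powershell
# Two hypothetical blocks; the second MAC address is invented for illustration.
$fileContent = @(
    'ap71xx 00-01-23-45-67-89',
    ' use profile PROFILE',
    '!',
    'ap71xx 00-01-23-45-67-8A',
    ' use profile OTHER',
    '!'
) -join "`n"

$found = ($fileContent | Select-String '(?smi)ap71xx([^!]+!)' -AllMatches).Matches

$whole = $found[0].Groups[0].Value   # entire match, starting at ap71xx
$inner = $found[0].Groups[1].Value   # only what the parentheses captured
```

$whole begins with ap71xx, while $inner begins with the space before the MAC address; both end at the exclamation mark.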
This regex will search for the text ap followed by any number of characters and new lines ending with a !:
(?si)(ap).+?\!{1}
So I was a little bored. I wrote a script that will break up the text file as you described (as long as it only contains the lines you displayed). It might work with other random lines, as long as they don't contain the key words: ap, profile, domain, hostname, or area. It will import them, and check line by line for each of the properties (MAC, Profile, domain, hostname, area) and place them into an object that can be used later. I know this isn't what you asked for, but since I spent time working on it, hopefully it can be used for some good. Here is the script if anyone is interested. It will need to be tweaked to your specific needs:
$Lines = Get-Content "c:\test\test.txt"
$varObjs = @()
for ($num = 0; $num -lt $lines.Count; $num =$varLast ) {
#Checks to make sure the line isn't blank or a !. If it is, it skips to next line
if ($Lines[$num] -match "!") {
$varLast++
continue
}
if (([regex]::Match($Lines[$num],"^\s.*$")).success) {
$varLast++
continue
}
$Index = [array]::IndexOf($lines, $lines[$num])
$b=0
$varObj = New-Object System.Object
while ($Lines[$num + $b] -notmatch "!" ) {
#Checks line by line to see what it matches, adds to the $varObj when it finds what it wants.
if ($Lines[$num + $b] -match "ap") { $varObj | Add-Member -MemberType NoteProperty -Name Mac -Value $([regex]::Split($lines[$num + $b],"\s"))[1] }
if ($lines[$num + $b] -match "profile") { $varObj | Add-Member -MemberType NoteProperty -Name Profile -Value $([regex]::Split($lines[$num + $b],"\s"))[3] }
if ($Lines[$num + $b] -match "domain") { $varObj | Add-Member -MemberType NoteProperty -Name rf-domain -Value $([regex]::Split($lines[$num + $b],"\s"))[3] }
if ($Lines[$num + $b] -match "hostname") { $varObj | Add-Member -MemberType NoteProperty -Name hostname -Value $([regex]::Split($lines[$num + $b],"\s"))[2] }
if ($Lines[$num + $b] -match "area") { $varObj | Add-Member -MemberType NoteProperty -Name area -Value $([regex]::Split($lines[$num + $b],"\s"))[2] }
$b ++
} #end While
#Adds the $varObj to $varObjs for future use
$varObjs += $varObj
$varLast = ($b + $Index) + 2
}#End for ($num = 0; $num -lt $lines.Count; $num = $varLast)
#displays the $varObjs
$varObjs
To me, a very clean and simple approach is to use a multiline regex block with named captures, like this:
# Based on this text configuration:
$configurationText = @"
ap71xx 00-01-23-45-67-89
use profile PROFILE
use rf-domain DOMAIN
hostname ACCESSPOINT
area inside
!
"@
# We can build a multiline regex block with the strings to be captured.
# Here, I am using the regex '.*?', which roughly means 'capture anything, but as little as possible'.
# A more specific regex can be defined for each field to capture.
# ( ) in the regex defines a group
# ?<> names a group
$regex = @"
(?<userId>.*?) (?<userCode>.*?)
use profile (?<userProfile>.*?)
use rf-domain (?<userDomain>.*?)
hostname (?<hostname>.*?)
area (?<area>.*?)
!
"@
# Lets see if this matches !
if($configurationText -match $regex)
{
# it does !
Write-Host "Config text is successfully matched, here are the matches:"
$Matches
}
else
{
Write-Host "Config text could not be matched."
}
This script outputs the following:
PS C:\Users\xdelecroix> C:\FusionInvest\powershell\regex-capture-multiline-stackoverflow.ps1
Config text is successfully matched, here are the matches:
Name                           Value
----                           -----
hostname                       ACCESSPOINT
userProfile                    PROFILE
userCode                       00-01-23-45-67-89
area                           inside
userId                         ap71xx
userDomain                     DOMAIN
0                              ap71xx 00-01-23-45-67-89...
For more flexibility, you can use Select-String instead of -match, but this is not really important here, in the context of this sample.
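When the file contains several blocks, the same named-capture idea can be extended with [regex]::Matches to produce one object per block. The following is a sketch under a few assumptions: the blocks are trimmed to three fields, the second block is invented, lines are joined with plain line feeds, and the group names mac, profile and hostname are my own choices:

```powershell
# Trimmed, partly invented sample: two blocks joined with line feeds.
$configurationText = @(
    'ap71xx 00-01-23-45-67-89',
    'use profile PROFILE',
    'hostname ACCESSPOINT',
    '!',
    'ap71xx 00-01-23-45-67-8A',
    'use profile OTHER',
    'hostname SECOND',
    '!'
) -join "`n"

# \S+ and [^\n]+ are tighter than .*? and cannot run past the end of a line.
$regex = 'ap71xx (?<mac>\S+)\nuse profile (?<profile>[^\n]+)\nhostname (?<hostname>[^\n]+)\n!'

# One custom object per matched block, built from the named groups.
$accessPoints = foreach ($m in [regex]::Matches($configurationText, $regex)) {
    [PSCustomObject]@{
        Mac      = $m.Groups['mac'].Value
        Profile  = $m.Groups['profile'].Value
        Hostname = $m.Groups['hostname'].Value
    }
}
```

The resulting objects can be piped to Format-Table or Export-Csv. A file with Windows line endings would need \r?\n in place of \n.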
Here's my take. If you don't need the regex, you can use -like or .contains(). The question never says what the search pattern is. Here's an example with a windows text file.
$file = (get-content -raw file.txt) -replace "`r" # avoid the line ending issue
$pattern = 'two
three
f.*' -replace "`r"
# just showing what they really are
$file -replace "`r",'\r' -replace "`n",'\n'
$pattern -replace "`r",'\r' -replace "`n",'\n'
$file -match $pattern
$file | select-string $pattern -quiet