PowerShell regex not being invoked on piped output
Developed this statement on my primary workstation where it (correctly) outputs a delimited textfile:
type output.tmp | -match /r /v "^-[-|]*-.$" > output.csv
Now, working on my laptop (same win8.1) where supposedly all the same PS modules and snapins are loaded, it tosses an error:
-match : The term '-match' is not recognized as the name of a cmdlet,
Yet:
"Software" –match "soft"
works.
1) Why?
2) Is there a PS commandlet I should invoke to be able to get a more verbose/helpful error output?
thx
-match is an operator on two arguments (one placed before one after the -match).
But at the beginning of each pipeline segment you need a command (including cmdlets) [1].
There are two approaches:
Wrap the -match into a cmdlet like foreach-object (or its % alias):
... | %{ $_ -match $regex } | ...
remembering that -match returns a boolean, not the matched text.
Use Select-String which is a cmdlet explicitly included for searching text. This does return the matched text (along with some other information), and can read a file itself:
Select-String -Path $inputFile $regex
[1] Strictly speaking: except the first segment, which can be any expression.
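To make the difference between the two approaches concrete, here is a minimal sketch. The sample lines stand in for the contents of output.tmp and are purely hypothetical; the regex is the one from the question.

```powershell
# Hypothetical sample data standing in for the lines of output.tmp.
$lines = 'a|b|c', '-----', 'd|e|f'
$regex = '^-[-|]*-.$'

# -match inside ForEach-Object emits one Boolean per input line.
$bools = $lines | ForEach-Object { $_ -match $regex }

# Where-Object uses that Boolean as a filter and emits the line itself.
$kept = @($lines | Where-Object { $_ -match $regex })

# Select-String emits MatchInfo objects; the Line property holds the matched line.
$selected = @($lines | Select-String -Pattern $regex | ForEach-Object { $_.Line })

$bools
$kept
$selected
```

If the goal is to keep or drop whole lines, Where-Object (or Select-String) is usually the better fit; ForEach-Object with -match is only useful when the Boolean itself is what you want.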
The reason for the error is that match is a comparison operator, not a cmdlet:
Comparison operators let you specify conditions for comparing values
and finding values that match specified patterns. To use a comparison
operator, specify the values that you want to compare together with an
operator that separates these values.
Also:
The match operators (-Match and -NotMatch) find elements that match or do not match a specified pattern using regular expressions.
The syntax is:
<string[]> -Match <regular-expression>
<string[]> -NotMatch <regular-expression>
The following examples show some uses of the -Match operator:
PS C:\> "Windows", "PowerShell" -Match ".shell"
PowerShell
PS C:\> (Get-Command Get-Member -Syntax) -Match "-view"
True
PS C:\> (Get-Command Get-Member -Syntax) -NotMatch "-path"
True
PS C:\> (Get-Content Servers.txt) -Match "^Server\d\d"
Server01
Server02
The match operators search only in strings. They cannot search in arrays of integers or other objects.
So, the correct syntax is:
@(type output.tmp) -match "^-[-|]*-.$" > output.csv
Note: Just as @mjolinor suggested, the @ prefix forces (type output.tmp) into an array, in case the input file contains only one line.
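The effect of the array-subexpression operator @(...) can be seen with a single value. This is a small sketch with a made-up server name, not taken from the question's data:

```powershell
# A single string, as Get-Content would return for a one-line file.
$single = 'Server01'

# Scalar left-hand side: -match returns a Boolean.
$asScalar = $single -match '^Server\d\d'

# Array left-hand side: -match returns the matching elements.
$asArray = @($single) -match '^Server\d\d'

$asScalar.GetType().Name
$asArray.GetType().Name
$asArray
```

Without @(...), a one-line file would write True or False to output.csv instead of the matching line.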
To get obnoxious amounts of debug output, get-Help Set-PSDebug
You simply need to add the following:
type output.tmp | ? { $_ -match "^-[-|]*-.$" } > output.csv
Or the more powershell-y way:
Get-Content -Path:"Output.Tmp" | Where { $_ -match "^-[-|]*-.$" } | Out-File -FilePath:"output.csv"
Related
Move directories that match given regex
I am trying to move directories with their contents. The directory names are letters followed by digits (a2, a2321, sdadsa2321321), so the regex would be [a-zA-Z]+\d+. However, it doesn't work.
$SourceDirectoryPath = "C:/directory/[a-zA-Z]+\d+"
$TargetFilePath = "C:/directory/target"
New-Item -ItemType "directory" -Path $TargetFilePath
Move-Item -Path $SourceDirectoryPath -Destination $TargetFilePath -Force
If I replace [a-zA-Z]+\d+ with a simple wildcard like a*, it moves multiple directories, which proves that [a-zA-Z]+\d+ is the only incorrect part of the script.
Question: What is the correct form of the regex [a-zA-Z]+\d+ in PowerShell? This regex is fully correct in Java, but for some reason it doesn't work here.
Maybe this is what you want:
$sourceDir = 'D:\source'
$destDir = 'D:\destination'
$pattern = '^.*[a-zA-Z]+\d+$'
$baseDir = Get-ChildItem -Path $sourceDir -Recurse -Directory
foreach( $directory in $baseDir ) {
    if( $directory.Name -match $pattern ) {
        Move-Item -Path $directory.FullName -Destination $destDir -Force
    }
}
To use regex matching on files and folders with Get-ChildItem, you will need to use a Where-Object clause. This should do it:
$SourceDirectoryPath = 'C:\directory'
$TargetFilePath = 'C:\directory\target'
# create the target path if it does not exist yet
if (!(Test-Path -Path $TargetFilePath -PathType Container)) {
    $null = New-Item -Path $TargetFilePath -ItemType Directory
}
Get-ChildItem -Path $SourceDirectoryPath -Directory |
    Where-Object { $_.Name -match '^[a-z]+\d+$' } |
    ForEach-Object { $_ | Move-Item -Destination $TargetFilePath -Force }
"If I replace [a-zA-Z]+\d+ with simple wildcards like a* it moves multiple directories, this proves that [a-zA-Z]+\d+ is the only incorrect part of the script."
Indeed: The -Path parameter of file-related cmdlets can only accept wildcard expressions (see about_Wildcards), not regular expressions (regexes) (see about_Regular_Expressions).
While distantly related, the two types of expressions are syntactically different: wildcard expressions are conceptually and syntactically simpler, but far less powerful, and not powerful enough for your use case. See AdminOfThings' comment on the question for a quick intro.
Also note that many PowerShell cmdlets conveniently accept wildcards in other types of arguments (unrelated to the filesystem), such as Get-Command *job*, which finds all available commands whose name contains the word job.
By contrast, use of regexes always requires a separate, explicit operation (unless a command is explicitly designed to accept regexes as arguments), via operators such as -match and -replace, cmdlets such as Select-String, or the switch statement with the -Regex option.
In your case, you need to filter the directories of interest from among all subdirectories, by combining the Where-Object cmdlet with -match, the regular-expression matching operator. The syntactically simplest form is to use an operation statement (a cleaner alternative to passing a script block { ... } in which $_ must be used to refer to the input object at hand), as shown in the following command:
# Define the *parent* path of the dirs. to move.
# The actual dirs. must be filtered by regex below.
$SourceDirectoryParentPath = 'C:/directory'
$TargetFilePath = 'C:/directory/target'
# Note: If you add -Force, no error occurs if the directory already exists.
#       New-Item produces output, a System.IO.DirectoryInfo in this case.
#       To suppress the output, use: $null = New-Item ...
New-Item -ItemType "directory" -Path $TargetFilePath
# Enumerate the child directories of the parent path,
# and filter by whether each child directory's name matches the regex.
Get-ChildItem -Directory $SourceDirectoryParentPath |
    Where-Object Name -match '^[a-z]+\d+$' |
    Move-Item -Destination $TargetFilePath -Force
Note that I've changed regex [a-zA-Z]+\d+ to ^[a-z]+\d+$, because:
PowerShell's regex matching is case-insensitive by default, so [a-z] covers both upper- and lowercase (English) letters.
The -match operator performs substring matching, so you need to anchor the regex with ^ (match at the start) and $ (match at the end) in order to ensure that the entire input string matches your expression.
Also note that I've used a single-quoted string ('...') rather than a double-quoted one ("..."), which is preferable for regex literals, so that no confusion arises between which characters are seen by the regex engine and which characters PowerShell itself may interpolate beforehand, notably $ and `.
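The points above about wildcards, anchoring, and case sensitivity can be checked without touching the filesystem. The following sketch uses a hypothetical list of directory names (the first three come from the question, the rest are invented for contrast):

```powershell
# Hypothetical directory names; no filesystem access is needed for the comparison.
$names = 'a2', 'a2321', 'sdadsa2321321', 'target', '2abc', 'ab12cd'

# Wildcard matching (-like): a* means "starts with a".
$wildcard = @($names -like 'a*')

# Unanchored regex: matches any name that *contains* letters followed by digits.
$unanchored = @($names -match '[a-z]+\d+')

# Anchored regex: the whole name must be letters followed by digits.
$anchored = @($names -match '^[a-z]+\d+$')

# -match is case-insensitive by default; -cmatch is the case-sensitive variant.
$insensitive = 'A2' -match '^[a-z]+\d+$'
$sensitive   = 'A2' -cmatch '^[a-z]+\d+$'

$wildcard -join ', '
$unanchored -join ', '
$anchored -join ', '
```

Note how 'ab12cd' passes the unanchored regex (its substring 'ab12' matches) but is rejected once ^ and $ are added.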
Issues finding and replacing strings in PowerShell
I'm rather new to PowerShell and I'm trying to write a PowerShell script to convert some statements in VBScript to Microsoft JScript. Here is my code:
$vbs = 'C:\infile.vbs'
$js = 'C:\outfile.js'
(Get-Content $vbs | Set-Content $js)
(Get-Content $js) |
Foreach-Object { $_ -match "Sub " } | Foreach-Object { "$_()`n`{" } | Foreach-Object { $_ -replace "Sub", "function" } | Out-File $js
Foreach-Object { $_ -match "End Sub" } | Foreach-Object { $_ -replace "End Sub", "`}" } | Out-File $js
Foreach-Object { $_ -match "Function " } | Foreach-Object { "$_()`n`{" } | Foreach-Object { $_ -replace "Function", "function" } | Out-File $js
Foreach-Object { $_ -match "End Function" } | Foreach-Object { $_ -replace "End Function", "`}" } | Out-File $js
What I want is for my PowerShell program to take the code from the VBScript input file infile.vbs, convert it, and output it to the JScript output file outfile.js. Here is an example of what I want it to do:
Input file:
Sub HelloWorld
(Code Here)
End Sub
Output file:
function HelloWorld()
{
(Code Here)
}
Something similar would happen with regard to functions. From there, I would tweak the code manually to convert it.
When I run my program in PowerShell v5.1, it does not show any errors. However, when I open outfile.js, I see only one line:
False
So really, I have two questions.
1. Why is this happening?
2. How can I fix this program so that it behaves how I want it to (as detailed above)?
Thanks, Gabe
You could also do this with the switch statement. Like so:
$vbs = 'C:\infile.vbs'
$js = 'C:\outfile.js'
Get-Content $vbs | ForEach-Object {
    switch -Regex ($_) {
        'Sub ' {
            'function {0}(){1}{2}' -f $_.Remove($_.IndexOf('Sub '),4).Trim(),[Environment]::NewLine,'{'
        }
        'End Sub' {
            '}'
        }
        'Function ' {
            'function {0}(){1}{2}' -f $_.Remove($_.IndexOf('Function '),9).Trim(),[Environment]::NewLine,'{'
        }
        'End Function' {
            '}'
        }
        default {
            $_
        }
    }
} | Out-File $js
As for question #2 (How can I fix this program [...]?):
Kirill Pashkov's helpful answer offers an elegant solution based on the switch statement. Note, however, that his solution:
- is predicated on Sub <name> / Function <name> statement parts not being on the same line as the matching End Sub / End Function parts. While this is typically the case, it isn't a syntactical requirement; e.g., Sub Foo() WScript.Echo("hi") End Sub, on a single line, works too.
- in line with your own solution attempt, blindly appends () to Sub / Function definitions, which won't work with input procedures / functions that already have parameter declarations (e.g., Sub Foo (bar, baz)).
The following solution:
- also works with single-line Sub / Function definitions
- correctly preserves parameter declarations
Get-Content $vbs | ForEach-Object {
    $_ -replace '\b(?:sub|function)\s+(\w+)\s*(\(.*?\))', 'function $1$2 {' `
       -replace '\bend\s+(?:sub|function)\b', '}'
} | Out-File $js
The above relies heavily on regexes (regular expressions) to transform the input; for specifics on how regex matching results can be referred to in the -replace operator's replacement-string operand, see this answer.
Caveat: There are many other syntax differences between VBScript and JScript that your approach doesn't cover, notably that VBScript has no return statement and instead uses <funcName> = ... to return values from functions.
As for question #1: "However, when I open outfile.js, I see only one line: False [...] 1. Why is this happening?"
All but the first ForEach-Object cmdlet call run in separate statements, because the initial pipeline ends with the first call to Out-File $js.
The subsequent ForEach-Object calls each start a new pipeline, and since each pipeline ends with Out-File $js, each such pipeline writes to file $js and thereby overwrites whatever the previous one wrote. Therefore, it is the last pipeline that determines the ultimate contents of file $js.
A ForEach-Object that starts a pipeline receives no input. However, its associated script block ({...}) is still entered once in this case, with $_ being $null [1]:
The last pipeline starts with Foreach-Object { $_ -match "End Function" }, so its output is the equivalent of $null -match "End Function", which yields $False, because -match with a scalar LHS (a single input object) outputs a Boolean value that indicates whether a match was found or not.
Therefore, given that the middle pipeline segment (Foreach-Object { $_ -replace "End Function", "}" }) is an effective no-op ($False is stringified to 'False', and the -replace operator therefore finds no match to replace and passes the stringified input out unmodified), Out-File $js receives the string 'False' and writes just that to output file $js.
Even if you transformed your separate commands into a single pipeline with a single Out-File $js segment at the very end, your command wouldn't work, however:
Given that Get-Content sends the input file's lines through the pipeline one by one, something like $_ -match "Sub " will again produce a Boolean result, indicating whether the line at hand ($_) matched the string "Sub ", and pass that on.
While you could turn -match into a filter by making the LHS an array, by enclosing it in the array-subexpression operator @(...) (e.g., @($_) -match "Sub "), that would:
- pass lines that contain the substring Sub through as a whole, and
- omit lines that don't.
In other words, this wouldn't work as intended, because:
- lines that do not contain a matching substring would be omitted from the output, and
- the lines that do match are reflected in full in $_ in the next pipeline segment, not just the matched part.
[1] Strictly speaking, $_ will retain whatever value it had in the current scope, but that will only be non-$null if you explicitly assigned a value to $_ - given that $_ is an automatic variable that is normally controlled by PowerShell itself, however, doing so is ill-advised - see this GitHub discussion.
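The contrast between -match (which reports a Boolean per line) and -replace (which transforms matching lines and passes the others through unchanged) can be sketched on an in-memory sample. The three VBScript lines below are hypothetical; the two regexes are the ones from the -replace solution above.

```powershell
# Hypothetical VBScript input, held in memory instead of read from a file.
$vbLines = 'Sub HelloWorld()', '    WScript.Echo "hi"', 'End Sub'

# -match per line yields only Booleans; the line text is lost.
$flags = $vbLines | ForEach-Object { $_ -match 'Sub ' }

# Chained -replace keeps every line and rewrites only the parts that match.
$converted = $vbLines | ForEach-Object {
    $_ -replace '\b(?:sub|function)\s+(\w+)\s*(\(.*?\))', 'function $1$2 {' `
       -replace '\bend\s+(?:sub|function)\b', '}'
}

$flags
$converted
```

Note that this particular regex only rewrites definitions that already have a parenthesized parameter list, so a bare `Sub HelloWorld` line would pass through unchanged.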
OK, there are a few things wrong with this script. Foreach-Object, otherwise known as %, iterates over every item in a pipeline. An example is
@(1..10) | %{ "This is Array Item $_"}
This will output 10 lines, one per array item. In your current script you are using it where a Where-Object, also known as ?, should be.
@(1..10) | ?{ $_ -gt 5 }
This will output all numbers greater than 5. An example of what you are trying to go for is something like
function ConvertTo-JS([string]$InputFilePath,[string]$SaveAs){
    Get-Content $InputFilePath |
        %{$_ -replace "Sub", "function"} |
        %{$_ -replace "End Function", "}"} |
        %{$_ -replace "Function", "function"} |
        %{$_ -replace "End Function", "}" } |
        Out-File $SaveAs
}
ConvertTo-JS -InputFilePath "C:\TEST\TEST.vbs" -SaveAs "C:\TEST\TEST.JS"
This doesn't take into account adding a { at the beginning of a function, or adding the () either. But with the information provided, hopefully that puts you on the right track.
Regular expression seems not to work in Where-Object cmdlet
I am trying to add quote characters around two fields in a file of comma-separated lines. Here is one line of data:
1/22/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
which I would like to become this:
1/22/2018 0:00:00,"0000000","001B9706BE",1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
I began developing my regular expression in a simple PowerShell script, and soon had the following:
$strData = '1/29/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0'
$strNew = $strData -replace "([^,]*),([^,]*),([^,]*),(.*)",'$1,"$2","$3",$4'
$strNew
which gives me this output:
1/29/2018 0:00:00,"0000000","001B9706BE",1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
Great! I'm all set. Extending this example to the general case of a file of similar lines of data:
Get-Content test_data.csv | Where-Object -FilterScript { $_ -replace "([^,]*),([^,]*),([^,]*),(.*)", '$1,"$2","$3",$4' }
This is a listing of test_data.csv:
1/29/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104938428,0016C4C483,1,45,0,1,0,0,0,0,0,0,0,0,0,0,35,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104943875,0016C4B0BC,1,31,0,1,0,0,0,0,0,0,0,0,0,0,25,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104948067,0016C4834D,1,33,0,1,0,0,0,0,0,0,0,0,0,0,23,0,1,0,0,0,0,0,0,0,0,0,0
This is the output of my script:
1/29/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104938428,0016C4C483,1,45,0,1,0,0,0,0,0,0,0,0,0,0,35,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104943875,0016C4B0BC,1,31,0,1,0,0,0,0,0,0,0,0,0,0,25,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104948067,0016C4834D,1,33,0,1,0,0,0,0,0,0,0,0,0,0,23,0,1,0,0,0,0,0,0,0,0,0,0
I have also tried this version of the script:
Get-Content test_data.csv | Where-Object -FilterScript { $_ -replace "([^,]*),([^,]*),([^,]*),(.*)", "`$1,`"`$2`",`"`$3`",$4" }
and obtained the same results. My simple test script has convinced me that the regex is correct, but something happens when I use that regex inside a filter script in the Where-Object cmdlet. What simple, yet critical, detail am I overlooking here?
Here is my PSVersion:
Major  Minor  Build  Revision
-----  -----  -----  --------
5      0      10586  117
You're misunderstanding how Where-Object works. The cmdlet outputs those input lines for which the -FilterScript expression evaluates to $true. It does NOT output whatever you do inside that scriptblock (you'd use ForEach-Object for that).
You don't need either Where-Object or ForEach-Object, though. Just put Get-Content in parentheses and use that as the first operand for the -replace operator. You also don't need the 4th capturing group. I would recommend anchoring the expression at the beginning of the string, though.
(Get-Content test_data.csv) -replace '^([^,]*),([^,]*),([^,]*)', '$1,"$2","$3"'
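The three behaviours described above can be compared side by side. This is a minimal sketch on two shortened, in-memory rows (the trailing columns of the question's data are omitted here for brevity):

```powershell
# Two shortened sample rows; the real file has many more columns.
$rows = '1/29/2018 0:00:00,0000000,001B9706BE,1,21',
        '1/29/2018 0:00:00,104938428,0016C4C483,1,45'
$pattern = '^([^,]*),([^,]*),([^,]*)'
$replacement = '$1,"$2","$3"'

# Where-Object: the non-empty result string is treated as $true,
# so every original row passes through unmodified.
$filtered = @($rows | Where-Object { $_ -replace $pattern, $replacement })

# ForEach-Object: the result of the script block is what gets output.
$mapped = @($rows | ForEach-Object { $_ -replace $pattern, $replacement })

# -replace with an array as its left operand processes each element.
$direct = @($rows -replace $pattern, $replacement)

$filtered
$mapped
$direct
```

The ForEach-Object and direct forms produce the same output; the direct form just skips the pipeline.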
This seems to work here. I used ForEach-Object to process each record.
Get-Content test_data.csv | ForEach-Object { $_ -replace "([^,]*),([^,]*),([^,]*),(.*)", '$1,"$2","$3",$4' }
This also seems to work. It uses the ? to create a reluctant (lazy) capture.
Get-Content test_data.csv | ForEach-Object { $_ -replace '(.*?),(.*?),(.*?),(.*)', '$1,"$2","$3",$4' }
I would just make a small change to what you have in order for this to work. Simply change the script to the following, noting that I changed the Where-Object -FilterScript to a ForEach-Object and fixed a minor typo on the last item in the replacement string (the $ of $4 was not escaped inside the double-quoted string):
Get-Content c:\temp\test_data.csv | ForEach-Object { $_ -replace "([^,]*),([^,]*),([^,]*),(.*)", "`$1,`"`$2`",`"`$3`",`$4" }
I tested this with the data you provided and it adds the quotes to the correct columns.
Regex in powershell does not work as expected
I store the output of a defragmentation analysis in a variable, then I try to match a pattern to retrieve a number. In an online regex tester it works fine, but in PowerShell, $result -match $pattern returns False.
My code:
$result = Defrag C: /A /V | Out-String
echo $result
$pattern = "fragmenté[^.0-9]*([0-9]+)%"
$result -match $pattern
What am I doing wrong?
I actually had no issue with your code. I just needed to change the match to support my English output. It is possible that Wolfgang Kluge is onto something about the whitespace. However, if your output actually matches what you have in the regex tester, then I'm not sure what issue you are having.
For fun I propose this update to your code. It uses ConvertFrom-StringData. I explain the code more in this answer.
$defrag = Defrag C: /A /V | out-string
$hash = (($defrag -split "`r`n" | Where-Object{$_ -match "="}) -join "`r`n" | ConvertFrom-StringData)
$result = New-Object -TypeName PSCustomObject -Property $hash
$result."Quantité totale d'espace fragmenté"
This is of course assuming that your PowerShell is perfectly OK with the accents in the words. On my ISE 3.0 the above code works.
Again, the code in your question was working just fine for me. I also don't think the Out-String is required; I still get positive output. Without Out-String I get extra output that includes the entire matched line; with it I just get a Boolean. In both cases (using the following code) I still get a result.
$result = Defrag C: /A /V #| Out-String
$pattern = "fragmented space[^.0-9]*([0-9]+)%"
$result -match $pattern
$Matches[1]
-match works as an array operator, which changes how $result is treated. With Out-String, $result is a System.String, and without it you get a System.Object[].
False Theory
The only way I can get the match to be False is if I am not running PowerShell as an administrator. That is important because, if not, you will get the message:
The disk defragmenter cannot start because you have insufficient privileges to perform this operation. (0x89000024)
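The scalar-versus-array behaviour can be reproduced without running Defrag. In this sketch the two text lines are invented stand-ins for the tool's English output, not real Defrag output:

```powershell
# Invented stand-in for captured Defrag output: an array of lines.
$sampleLines = 'Volume size = 100 GB', 'Total fragmented space = 12%'
$pattern = 'fragmented space[^.0-9]*([0-9]+)%'

# Array left operand: -match returns the matching line(s) and does not set $Matches.
$arrayResult = @($sampleLines -match $pattern)

# Out-String joins the lines into one string, so -match returns a Boolean
# and fills the automatic $Matches variable with the capture groups.
$asString = $sampleLines | Out-String
$found = $asString -match $pattern
$percent = $Matches[1]

$arrayResult
$found
$percent
```

If the captured number is what you are after, matching against a single string (or against one line at a time) is the reliable way to get $Matches populated.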
Powershell regex to match vhd or vhdx at the end of a string
I'm brand new to PowerShell and I'm trying to write a script to copy files ending in vhd or vhdx. I can enumerate a directory of files like so:
$NEWEST_VHD = Get-ChildItem -Path $vhdSourceDir | Where-Object Name -match ".vhdx?"
This will match
foo.vhd
foo.vhdx
However, this will also match
foo.vhdxxxx
How can I write a match that will only match files ending in exactly vhd or vhdx?
Unsuccessful attempts:
Where-Object Name -match ".vhdx?"
Where-Object Name -like ".vhdx?"
Where-Object Name -match ".[vhd]x?"
Where-Object Name -match ".[vhd]\^x?"
Resources I've investigated:
http://ss64.com/ps/syntax-regex.html
https://technet.microsoft.com/en-us/library/ff730947.aspx
http://www.regexr.com/
Put a $ at the end of your pattern:
-match ".vhdx?$"
$ in a regex pattern represents the end of the string. So, the above will only match .vhdx? if it is at the end. See a demonstration below:
PS > 'foo.vhd' -match ".vhdx?$"
True
PS > 'foo.vhdx' -match ".vhdx?$"
True
PS > 'foo.vhdxxxx' -match ".vhdx?$"
False
PS >
Also, the . character has a special meaning in a regex pattern: it tells PowerShell to match any character except a newline. So, you could experience behavior such as:
PS > 'foo.xvhd' -match ".vhdx?$"
True
PS >
If this is undesirable, you can add a \ before the . character:
PS > 'foo.xvhd' -match "\.vhdx?$"
False
PS >
This tells PowerShell to match a literal period instead.
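The three variants of the pattern can be compared on one list of names. This sketch reuses the file names from the question and the answer above, plus one invented name (foovhd) to show the effect of the unescaped dot:

```powershell
# File names from the discussion above, plus 'foovhd' as an extra edge case.
$names = 'foo.vhd', 'foo.vhdx', 'foo.vhdxxxx', 'foo.xvhd', 'foovhd'

# Unanchored, unescaped: matches anywhere in the name.
$loose = @($names -match '.vhdx?')

# Anchored at the end, but '.' still matches any character.
$anchored = @($names -match '.vhdx?$')

# Anchored and with a literal dot: only real .vhd / .vhdx extensions.
$strict = @($names -match '\.vhdx?$')

$loose -join ', '
$anchored -join ', '
$strict -join ', '
```

Single-quoted patterns are used here so that PowerShell itself never tries to interpret the $ before the regex engine sees it.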
If you only want to check the extension, then you can just use the Extension property instead of Name:
$NEWEST_VHD = Get-ChildItem -Path $vhdSourceDir | Where-Object Extension -in '.vhd','.vhdx'
Mostly just an FYI, but there is no need for a regex solution for this particular issue. You could just use a simple filter.
$NEWEST_VHD = Get-ChildItem -Path $vhdSourceDir -Filter "*.vhd?"
Not perfect, but if you don't have files called "*.vhdz" then you would be safe. Again, this is not meant as an answer, just something useful to know. Reminder that ? in this case optionally matches a single character, but it is not regex, just a basic file system wildcard. Depending on how many files you have here, you could argue that this would be more efficient, since you get all the files you need from the get-go instead of filtering after the fact.