Can't seem to get RegEx to match - regex

I am trying to extract the Get-Help comment headers from a PowerShell script...using PowerShell. The file I'm reading looks something like this:
<#
.SYNOPSIS
Synopsis goes here.
It could span multiple lines.
Like this.
.DESCRIPTION
A description.
It could also span multiple lines.
.PARAMETER MyParam
Purpose of MyParam
.PARAMETER MySecondParam
Purpose of MySecondParam.
Notice that this section also starts with '.PARAMETER'.
This one should not be captured.
...and many many more lines like this...
#>
# Rest of the script...
I would like to get all the text below .DESCRIPTION, up to the first instance of .PARAMETER. So the desired output would be:
A description.
It could also span multiple lines.
Here's what I've tried:
$script = Get-Content -Path "C:\path\to\the\script.ps1" -Raw
$pattern = '\.DESCRIPTION(.*?)\.PARAMETER'
$description = $script | Select-String -Pattern $pattern
Write-Host $description
When I run that, $description is empty. If I change $pattern to .*, I get the entire contents of the file, as expected, so there must be something wrong with my RegEx pattern, but I can't figure out what.
Any ideas?

(get-help get-date).description
The `Get-Date` cmdlet gets a DateTime object that represents the current date
or a date that you specify. It can format the date and time in several Windows
and UNIX formats. You can use `Get-Date` to generate a date or time character
string, and then send the string to other cmdlets or programs.
(get-help .\script.ps1).description

the Select-String cmdlet works on entire strings and you have given it ONE string. [grin]
so, instead of fighting with that, i went with the -match operator. the following presumes you have loaded the entire file into $InStuff as one multiline string with -Raw.
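the (?ms) stuff is two regex flags - multiline & singleline. singleline is the one that matters here: it lets . match newlines, which the original pattern could not do.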
$InStuff -match '(?ms)(DESCRIPTION.*?)\.PARAMETER'
$Matches.1
output ...
DESCRIPTION
A description.
It could also span multiple lines.
note that there is a blank line at the end. you likely will want to trim that away.
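a minimal sketch of the same idea, with the capture group narrowed so the DESCRIPTION header and the surrounding whitespace are left out. the here-string is a made-up stand-in for the script file, and $description is just an illustrative variable name. only the (?s) flag is strictly needed here, since the pattern has no ^ or $ anchors.

```powershell
# Hypothetical sample text standing in for the script file's contents.
$InStuff = @'
<#
.SYNOPSIS
Synopsis goes here.
.DESCRIPTION
A description.
It could also span multiple lines.
.PARAMETER MyParam
Purpose of MyParam
.PARAMETER MySecondParam
Purpose of MySecondParam.
#>
'@

# (?s) lets '.' match newlines; the lazy '.*?' stops at the first '.PARAMETER'.
# The \s* on both sides keeps leading and trailing line breaks out of the capture.
if ($InStuff -match '(?s)\.DESCRIPTION\s*(.*?)\s*\.PARAMETER') {
    $description = $Matches[1]
}
$description
```

with the whitespace handled in the pattern, no separate trim step should be needed.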

In the words of @Mathias R. Jessen:
Don't use regex to parse PowerShell code in PowerShell
Use the PowerShell parser instead!
So, let's use PowerShell to parse PowerShell:
$ScriptFile = "C:\path\to\the\script.ps1"
$ScriptAST = [System.Management.Automation.Language.Parser]::ParseFile($ScriptFile, [ref]$null, [ref]$null)
$ScriptAST.GetHelpContent().Description
We use [System.Management.Automation.Language.Parser]::ParseFile() to parse our file and output an Abstract Syntax Tree (AST).
Once we have the Abstract Syntax Tree, we can then use the GetHelpContent() method (exactly what Get-Help uses) to get our parsed help content.
Since we are only interested in the Description portion, we can simply access it directly with .GetHelpContent().Description
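For trying this out without a file on disk, here is a small sketch that uses ParseInput() on a string instead of ParseFile(). The script text and the variable names are made up for illustration, and the null check is there because GetHelpContent() may return nothing when it finds no help comment.

```powershell
# Hypothetical script text; with a real file, ParseFile() takes the path instead.
$code = @'
<#
.SYNOPSIS
Synopsis goes here.
.DESCRIPTION
A description.
It could also span multiple lines.
.PARAMETER MyParam
Purpose of MyParam
#>
param($MyParam)
'@

$ast  = [System.Management.Automation.Language.Parser]::ParseInput($code, [ref]$null, [ref]$null)
$help = $ast.GetHelpContent()

# Guard against a missing help comment before reading the Description section.
$description = if ($help) { $help.Description.Trim() }
$description
```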

Related

Extract text between tags using PowerShell

I have an XML file that includes many instances of a particular tag/element. I am trying to capture each of these and then dump them into a new file.
I have the following script which does work, in that it takes the first occurrence of the text I am after and displays it to the console.
I am trying to incorporate foreach-object to retrieve all occurrences of ...allContent... but am failing to get it added correctly.
Here is my working script that displays the output I am after for the first occurrence only.
$firstString = "<RunListItems>"
$secondString = "</RunListItems>"
#Get content from file
$file = Get-Content "C:\Users\Bob\Desktop\ps\order.xml"
#Regex pattern to compare two strings
$pattern = "$firstString(.*?)$secondString"
#Perform the operation
$result = [regex]::Match($file,$pattern).Groups[1].Value
#Return result
return $result
Parsing XML text with regular expressions is brittle and therefore ill-advised.
PowerShell provides easy access to proper XML parsers, and in the case at hand you can use the Select-Xml cmdlet:
Select-Xml //RunListItems C:\Users\Bob\Desktop\ps\order.xml |
ForEach-Object { $_.Node.InnerText }
//RunListItems is an XPath query that selects all elements whose tag name is RunListItems throughout the document, irrespective of their position in the hierarchy (//)
The .Node property of the output objects (of type Microsoft.PowerShell.Commands.SelectXmlInfo) contains the matching element, and its .InnerText property returns its text content.
Note: If your XML document uses namespaces, you must pass a hashtable with prefix-to-URI mappings to Select-Xml's -Namespace parameter, and use these prefixes in the XPath query (-XPath) when referring to elements - see this answer for more information.
To save the output strings to a file, separated with newlines, simply append something like
| Set-Content out.txt; use Set-Content's -Encoding parameter to control the encoding, if needed.[1]
[1] In Windows PowerShell (versions up to 5.1), Set-Content defaults to the active ANSI code page. In PowerShell (Core) 7+, the consistent default across all cmdlets is BOM-less UTF-8. See this answer for more information.
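A self-contained sketch of the Select-Xml approach, using the -Content parameter with an inline document so it runs without order.xml. The XML below is invented for illustration; only the RunListItems element name comes from the question.

```powershell
# Hypothetical XML standing in for order.xml.
$xml = @'
<Order>
  <RunListItems>first</RunListItems>
  <Group>
    <RunListItems>second</RunListItems>
  </Group>
</Order>
'@

# //RunListItems selects every RunListItems element at any depth.
$items = Select-Xml -XPath '//RunListItems' -Content $xml |
    ForEach-Object { $_.Node.InnerText }

$items
```

Both elements are returned, including the nested one, which is what the single [regex]::Match() call in the question could not do.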

PowerShell Select-String regular expression to locate two strings on the same line

How do I use Select-String cmdlet to search a text file for a string which starts with a specific string, then contains random text and has another specific string towards the end of the line? I'm only interested in matches across a single line in the text file, not across the entire file.
For example I am searching to match both 'Set-QADUser' and 'WhatIf' on the same line in the file. And my example file contains the following line:
Set-QADUser -Identity $($c.ObjectGUID) -ObjectAttributes @{extensionattribute7=$ekdvalue} -WhatIf | Out-Null
How do I use Select-String along with a Regular Expression to locate the pattern in question? I tried using the following and it does work but it also matches other instances of either 'Set-QADUser' or 'WhatIf' found elsewhere in the text file and I only want to match instances when both search strings are found on the same line.
Select-String -path "test.ps1" -Pattern "Set-QADUser.*WhatIf" | Select Matches,LineNumber
To make this more complicated I actually want to perform this search from within the script file that is being searched. Effectively this is used to warn the user that the script being run is currently set to 'WhatIf' mode for testing. But of course the regEx matches the text from the actual Select-String cmd within the script when it's run - so it finds multiple matches and I can't figure out a very good way to overcome that issue. So far this is what I've got:
#Warn user about 'WhatIf' if detected
$line=Select-String -path $myinvocation.mycommand.name -Pattern "Set-QADUser.*WhatIf" | Select Matches,LineNumber
If ($line.Count -gt 1)
{
Write-Host "******* Warning ******"
Write-Host "Script is currently in 'WhatIf' mode; to make changes please remove '-WhatIf' parameter at line no. $($line[1].LineNumber)"
}
I'm sure there must be a better way to do this. Hope somebody can help.
Thanks
If you use the -Quiet switch on Select-String it will just return a boolean True/False, depending on whether it found a match or not.
-Quiet <SwitchParameter>
Returns a Boolean value (true or false), instead of a MatchInfo object. The value is "true" if the pattern is found; otherwise, the value is "false".
Required? false
Position? named
Default value Returns matches
Accept pipeline input? false
Accept wildcard characters? false
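A rough sketch of how -Quiet could be combined with a pattern that does not match its own line. The [W] character class is a swapped-in trick that the answer above does not mention: it still matches "W", but the pattern text no longer contains the literal "WhatIf". The temp file and its contents are hypothetical stand-ins for the running script.

```powershell
# Hypothetical stand-in for the script being checked, written to a temp file.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'whatif-sample.ps1'
Set-Content -Path $sample -Value @'
$found = Select-String -Path $PSCommandPath -Pattern 'Set-QADUser.*[W]hatIf' -Quiet
Set-QADUser -Identity $id -WhatIf | Out-Null
Set-QADUser -Identity $id
'@

# Line 1 of the sample holds the pattern itself and should not count as a hit.
$pattern = 'Set-QADUser.*[W]hatIf'
$found   = Select-String -Path $sample -Pattern $pattern -Quiet
$hits    = @(Select-String -Path $sample -Pattern $pattern)

if ($found) {
    Write-Host "Script is in 'WhatIf' mode; see line $($hits[0].LineNumber)"
}
Remove-Item $sample
```

Inside the real script, $PSCommandPath (PowerShell 3.0 and later) would take the place of $sample.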

Replace special character in powershell

For one of my daily manual find-and-replace tasks, I want to create a PowerShell script that finds a network location and replaces it with a new one.
Like :-
To find : \\Test\test
Replace with \\Test1\Test1
I am able to replace text without any special characters but for above I am getting regular expression error.
And also after replacing above string I want to save result in a text file which logs records of files which have been updated.
Seeing your example in the comment, just needs an Add-Content snip to complete it. This is how I would do it.
$c = '\\filedep\iservershare\DCSVWA'
$c = $c -replace [regex]::Escape('\\filedep\iservershare\DCSVWA'),('\\EGSISFS01\VolksWagon\DCSVWA')
Add-Content C:\Output.txt $c
I ran it twice, and the file ended up containing:
\\EGSISFS01\VolksWagon\DCSVWA
\\EGSISFS01\VolksWagon\DCSVWA
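Here is a short sketch with the paths from the question, showing why the escape is needed and a literal alternative. The $line text is made up. Note that -replace is case-insensitive by default, while the .Replace() method is case-sensitive and treats both arguments literally.

```powershell
$old  = '\\Test\test'
$new  = '\\Test1\Test1'
$line = 'Copy from \\Test\test\reports to local disk'   # hypothetical input line

# -replace treats its first operand as a regex, so the backslashes must be escaped.
$viaRegex = $line -replace [regex]::Escape($old), $new

# String.Replace() does no regex parsing at all, so nothing needs escaping.
$viaLiteral = $line.Replace($old, $new)

$viaRegex
$viaLiteral
```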

How to trim the file modification value from SVN log output with PowerShell

I have an SVN log being captured in PowerShell which I am then trying to modify and strip off everything except the file URL. The problem I am having is getting a regex to remove everything before the file URL. My entry is matched as:
M /trunk/project/application/myFile.cs
There are two spaces at the beginning which originally I was trying to replace with a Regex but that did not seem to work, so I use a trim and end up with:
M /trunk/project/application/myFile.cs
Now I want to get rid of the File status indicator so I have a regular expression like:
$entry = $entry.Replace("^[ADMR]\s+","")
Where $entry is the matched file URL but this doesn't seem to do anything, even removing the caret to just look for the value and space did not do anything. I know that $entry is a string, I originally thought Replace was not working as $entry was not a string, but running Get-Member during the script shows I have a string type. Is there something special about the svn file indicator or is the regex somehow off?
Given your example string:
$entry = 'M /trunk/project/application/myFile.cs'
$fileURL = ($entry -split ' /')[1]
Your regex doesn't work because string.Replace just does a literal string replacement and doesn't know about regexes. You'd probably want [Regex]::Replace or just the -replace operator.
But when using SVN with PowerShell, I'd always go with the XML format. SVN supports an --xml option on many of its commands (log, info, status, list and others), which makes them emit XML (albeit invalid XML if the command dies midway).
E.g.:
$x = [xml](svn log -l 3 --verbose --xml)
$x.log.logentry|%{$_.paths}|%{$_.path}|%{$_.'#text'}
will give you all paths.
But if you need a regex:
$entry -replace '^.*?\s+'
which will remove everything up to (and including) the first sequence of spaces which has the added benefit that you don't need to remember what characters may appear there, too.
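To make the difference concrete, here is a small sketch comparing the three calls on the entry from the question. The variable names are illustrative only.

```powershell
$entry = '   M /trunk/project/application/myFile.cs'

# String.Replace() looks for the literal text '^[ADMR]\s+', which never occurs.
$literal = $entry.Trim().Replace('^[ADMR]\s+', '')

# The -replace operator interprets the pattern as a regex.
$viaOperator = $entry.Trim() -replace '^[ADMR]\s+', ''

# [regex]::Replace() does the same; \s* at the front makes Trim() unnecessary.
$viaStatic = [regex]::Replace($entry, '^\s*[ADMR]\s+', '')

$literal
$viaOperator
$viaStatic
```

The first result still starts with the status letter, while the other two are reduced to the path.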

Remove Unused Functions in AutoIt Script with PowerShell

Alrighty..
So I am editing an AutoIt script that has a lot of unused functions in it. The original author saw fit to add all the functions from his/her includes files.
At first I tried to use the tools within AutoIt/SciTe to remove unused functions however for some freakish reason this rendered the script/compiled file useless. So now I am thinking it would be best to write a function remover.
Here is what I have so far:
Search for lines containing "Func _", then count how many times that function name appears in the file. If it appears only once, select it with Select-String:
$FileName=".\FILENAME.au3"
$File=Get-Content $FileName
$Funcs=$File|Select-String "Func _"
foreach ($Func in $Funcs) {
$FuncName=$Func.ToString().Split('( ')[1]
$Count=($File|Select-String $FuncName | Measure-Object).Count
if ($count -eq 1) {
$File|Select-String "Func _" $FuncName
}
}
What I would like to do is remove the function, likely with regex. So something like:
REMOVE "Func _"$func * "EndFunc"
The trouble has been that this is a search that spans multiple lines, from Func _NAMEOFFUNCTION to EndFunc. It's unclear to me if regex in PowerShell can even do this. Not all regex implementations seem to be able to span a search across lines. Is regex even the answer? I don't know.
When you use Get-Content in PowerShell 1.0 or 2.0 you can only get back an array of strings - one for each line. This isn't going to work when you need a regex to span multiple lines. Use this approach to read the file as a single string:
$FileContent = [io.file]::ReadAllText($FileName)
If you are on PowerShell V3 or later, you can use the -Raw parameter to read the file as a single string:
$FileContent = Get-Content $FileName -Raw
Then when you use Select-String you will need to modify the regex to enable singleline s (and probably multiline m) mode e.g.:
$FileContent | Select-String "(?smi)$FuncName" -AllMatches
Note the i is there to be case-insensitive. Use the -AllMatches parameter to match multiple function definitions within a file.
Here's a regex that should match an AutoIt function definition. It assumes the Func and EndFunc keywords are always placed at the beginning of a line and that they're case sensitive. The function name is captured in the group named FuncName (in C# you would access it via Groups["FuncName"]);
"(?m)^Func\s+\b(?<FuncName>_\w+\b).*\n(?:(?!EndFunc\b).*\n)*EndFunc.*\n?"
For the function names alone you can use "\b_\w+\b" (or maybe "\b_[A-Za-z]+\b"; I don't know how strict you need to be). Having almost zero experience with PowerShell, I would probably use [regex]::Matches and [regex]::Replace to do the work. I don't know if PS offers a better way.
I'm assuming you've read the whole file into a string as @Keith suggested, not line by line as you were doing originally.
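Putting the two answers together, a rough sketch of a function remover might look like this. The AutoIt source is invented for illustration, and the "used only once" test is the same heuristic as in the question: a name that appears only in its own definition is treated as unused.

```powershell
# Hypothetical AutoIt source, joined into one string as -Raw would return it.
$lines = @(
    'Func _Used()'
    '    Return 1'
    'EndFunc'
    'Func _Unused()'
    '    Return 2'
    'EndFunc'
    '$x = _Used()'
)
$source = $lines -join "`n"

# The function-definition regex from the answer above, unchanged.
$definition = '(?m)^Func\s+\b(?<FuncName>_\w+\b).*\n(?:(?!EndFunc\b).*\n)*EndFunc.*\n?'

foreach ($m in [regex]::Matches($source, $definition)) {
    $name = $m.Groups['FuncName'].Value
    # Count whole-word occurrences of the name; 1 means only the definition itself.
    $uses = [regex]::Matches($source, "\b$([regex]::Escape($name))\b").Count
    if ($uses -eq 1) {
        # Remove the whole definition, from Func through EndFunc.
        $source = $source.Replace($m.Value, '')
    }
}
$source
```

This does not account for function names that appear only inside comments or strings, so it is worth reviewing the output before overwriting the original file.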