PowerShell regex to get just hex part in strings - regex

I'm working on a function that builds a map from a string key to its hex value. I got the string key part working, but I'm having trouble getting the hex part to work. This is my function so far:
function Get-Contents4_h{
[cmdletbinding()]
Param ([string]$fileContent)
#define Error_Failed_To_Do_A 0x81A0 /* random comments */
#define Error_Failed_To_Do_B 0x810A
# create an ordered hashtable to store the results
$errorMap = [ordered]@{}
# process the lines one-by-one
switch -Regex ($fileContent -split '\r?\n') {
'define ([\w]*)' { # Error_Failed_To_Do_ #this works fine
$key = ($matches[1]).Trim()
}
'([0x\w]*)' { # 0x04A etc #this does not work
$errorMap[$key] = ($matches[1]).Trim()
}
}
# output the completed data as object
#[PsCustomObject]$errorMap
return $errorMap
}
I'm going to be looping through the returned map and matching the hex value with the key in another object.
This is what the string parameter to the function looks like:
#define Error_Failed_To_Do_A 0x81A0 /* random comments */
#define Error_Failed_To_Do_B 0x810A
For some reason my 0x\w regex is not returning anything on regex101.com. I've had luck with that with other hex numbers, but not this time.
I've tried this and other variations as well: ^[\s\S]*?#[\w]*[\s\S]+([0x\w]*)
This is with powershell 5.1 and VS Code.

You need to remove the [...] character class around 0x\w. The 0x occurs exactly once in the input string and is followed by at least one more character, but the expression [0x\w]* can be satisfied by an empty string (thanks to the * zero-or-more quantifier), so the first match it finds is empty.
I'd suggest matching the whole line at once with a single pattern instead:
switch -Regex ($fileContent -split '\r?\n') {
'^\s*#define\s+(\w+)\s+(0x\w+)' {
$key,$value = $Matches[1,2] |ForEach-Object Trim
$errorMap[$key] = $value
}
}
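A minimal sketch of the difference, using the sample line from the question (the variable names are only for illustration): the bracketed pattern succeeds with an empty match at the very start of the line, while the unbracketed one captures the hex literal.

```powershell
$line = '#define Error_Failed_To_Do_A 0x81A0 /* random comments */'

# Character class with *: succeeds at position 0 with an empty match,
# because '#' is not in the class and * allows zero characters.
$null = $line -match '([0x\w]*)'
$emptyCapture = $Matches[1]

# Literal 0x followed by one or more word characters: captures the hex value.
$null = $line -match '(0x\w+)'
$hexCapture = $Matches[1]

"Bracketed capture length: $($emptyCapture.Length)"
"Unbracketed capture: $hexCapture"
```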

Square brackets match any single character from the set inside them. The pattern with the square brackets has 18 matches in this line, the first being the empty string ''. Regex101.com (https://regex101.com/r/PZ8Y8C/1) reports the same thing (null). 0x[\w]* would work, but then you might as well drop the brackets. I made an example data file and a script showing how I would do it; this works for me.
'#define Error_Failed_To_Do_A 0x81A0 /* random comments */' |
select-string [0x\w]* -AllMatches | % matches | measure | % count
18
'#define Error_Failed_To_Do_A 0x81A0 /* random comments */
#define Error_Failed_To_Do_B 0x810A' |
set-content file.txt
# Get-Contents4_h.ps1
Param ($file)
switch -Regex -File $file {
'define (\w+).*(0x\w+)' {
[pscustomobject]@{
Error = $matches[1]
Hex = $matches[2]
}
}
}
.\Get-Contents4_h file.txt
Error Hex
----- ---
Error_Failed_To_Do_A 0x81A0
Error_Failed_To_Do_B 0x810A

Related

using a partial variable as part of regular expression string

I'm trying to loop through an array of alarm codes and use a regular expression to look it up in cpp code. I know my regex works when I hard code a value in and use double quotes for my regex, but I need to pass in a variable because it's a list of about 100 to look up with separate definitions. Below is what I want to use in general. How do I fix it so it works with $lookupItem instead of hard-coding "OTHER-ERROR" for example in the Get-EpxAlarm function? I tried single quotes and double quotes around $lookupItem in the $fullregex definition, and it returns nothing.
Function Get-EpxAlarm{
[cmdletbinding()]
Param ( [string]$fileContentsToParse, [string]$lookupItem)
Process
{
$lookupItem = "OTHER_ERROR"
Write-Host "In epx Alarm" -ForegroundColor Cyan
# construct regex
$fullregex = [regex]'$lookupItem', # Start of error message########variable needed
":[\s\Sa-zA-Z]*?=", # match anything, non-greedy
"(?<epxAlarm>[\sa-zA-Z_0-9]*)", # Capture epxAlarm Num
'' -join ''
# run the regex
$Values = $fileContentsToParse | Select-String -Pattern $fullregex -AllMatches
# Convert Name-Value pairs to object properties
$result = $Values.Matches
Write-Host $result
#Write-Host "result:" $result -ForegroundColor Green
return $result
}#process
}#function
#main code
...
Get-EpxAlarm -fileContentsToParse $epxContents -lookupItem $item
...
where $fileContentsToParse is
case OTHER_ERROR:
bstrEpxErrorNum = FATAL_ERROR;
break;
case RI_FAILED:
case FILE_FAILED:
case COMMUNICATION_FAILURE:
bstrEpxErrorNum = RENDERING_ERROR;
break;
So if I look for OTHER_ERROR, it should return FATAL_ERROR.
I tested my regular expression in regex editor and it works with the hard-coded value. How can I define my regex so that I use the parameter and it returns the same thing as hard-coding the parameter value?
I wouldn't recommend trying to construct a single regular expression to do complex source code parsing - it gets quite unreadable really quickly.
Instead, write a small error mapping parser that just reads the source code line by line and constructs the error mapping table as it goes along:
function Get-EpxErrorMapping {
param([string]$EPXFileContents)
# create hashtable to hold the final mappings
$errorMap = @{}
# create array to collect keys that are grouped together
$keys = @()
switch -Regex ($EPXFileContents -split '\r?\n') {
'case (\w+):' {
# add relevant key to key collection
$keys += $Matches[1] }
'bstrEpxErrorNum = (\w+);' {
# we've reached the relevant error, set it for all relevant keys
foreach($key in $keys){
$errorMap[$key] = $Matches[1]
}
}
'break' {
# reset/clear key collection
$keys = @()
}
}
return $errorMap
}
Now all you need to do is call this function and use the resulting table to resolve the $lookupItem value:
Function Get-EpxAlarm{
[CmdletBinding()]
param(
[string]$fileContentsToParse,
[string]$lookupItem
)
$errorMap = Get-EpxErrorMapping $fileContentsToParse
return $errorMap[$lookupItem]
}
Now we can get the corresponding error code:
$epxContents = @'
case OTHER_ERROR:
bstrEpxErrorNum = FATAL_ERROR;
break;
case RI_FAILED:
case FILE_FAILED:
case COMMUNICATION_FAILURE:
bstrEpxErrorNum = RENDERING_ERROR;
break;
'@
# this will now return the string "FATAL_ERROR"
Get-EpxAlarm -fileContentsToParse $epxContents -lookupItem OTHER_ERROR
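If a single regex is still preferred, the original problem with $lookupItem comes down to quoting: single-quoted strings are not expanded, so '$lookupItem' is the literal text. A minimal sketch of building the pattern with a double-quoted string follows; the simplified pattern and the sample text are assumptions for illustration, not the asker's exact regex. The $( ) subexpression keeps the variable separate from the colon that follows it, and [regex]::Escape guards against lookup values that contain regex metacharacters.

```powershell
$lookupItem = 'OTHER_ERROR'
$source = @'
case OTHER_ERROR:
bstrEpxErrorNum = FATAL_ERROR;
break;
'@

# Double quotes expand the subexpression; the rest of the pattern is passed through as-is.
$pattern = "case\s+$([regex]::Escape($lookupItem)):[\s\S]*?=\s*(?<epxAlarm>\w+)"

$epxAlarm = $null
if ($source -match $pattern) {
    # The named group holds the value assigned after the matching case label.
    $epxAlarm = $Matches['epxAlarm']
}
$epxAlarm
```

Note that this only handles a case label that is directly followed by its assignment; grouped labels such as RI_FAILED are better served by the line-by-line parser above.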

Matching Something Against Array List Using Where Object

I've found multiple examples of what I'm trying here, but for some reason it's not working.
I have a list of regular expressions that I'm checking against a single value and I can't seem to get a match.
I'm attempting to match domains. e.g. gmail.com, yahoo.com, live.com, etc.
I am importing a csv to get the domains and have debugged this code to make sure the values are what I expect. e.g. "gmail.com"
Regular expression examples AKA $FinalWhiteListArray
(?i)gmail\.com
(?i)yahoo\.com
(?i)live\.com
Code
Function CheckDirectoryForCSVFilesToSearch {
$global:CSVFiles = Get-ChildItem $Global:Directory -recurse -Include *.csv | % {$_.FullName} #removed -recurse
}
Function ImportCSVReports {
Foreach ($CurrentChangeReport in $global:CSVFiles) {
$global:ImportedChangeReport = Import-csv $CurrentChangeReport
}
}
Function CreateWhiteListArrayNOREGEX {
$Global:FinalWhiteListArray = New-Object System.Collections.ArrayList
$WhiteListPath = $Global:ScriptRootDir + "\" + "WhiteList.txt"
$Global:FinalWhiteListArray= Get-Content $WhiteListPath
}
$Global:ScriptRootDir = Split-Path -Path $psISE.CurrentFile.FullPath
$Global:Directory = $Global:ScriptRootDir + "\" + "Reports to Search" + "\" #Where to search for CSV files
CheckDirectoryForCSVFilesToSearch
ImportCSVReports
CreateWhiteListArrayNOREGEX
Foreach ($Global:Change in $global:ImportedChangeReport){
If (-not ([string]::IsNullOrEmpty($Global:Change.Previous_Provider_Contact_Email))){
$pos = $Global:Change.Provider_Contact_Email.IndexOf("@")
$leftPart = $Global:Change.Provider_Contact_Email.Substring(0, $pos)
$Global:Domain = $Global:Change.Provider_Contact_Email.Substring($pos+1)
$results = $Global:FinalWhiteListArray | Where-Object { $_ -match $global:Domain}
}
}
Thanks in advance for any help with this.
the problem with your current code is that you put the regex on the left side of the -match operator. [grin] the pattern belongs on the right, so swap the operands ($global:Domain -match $_) and your code ought to work.
taking into account what LotPings pointed out about case sensitivity and using a regex OR symbol to make one test per URL, here's a demo of some of that. the \b is for word boundaries, the | is the regex OR symbol. the $RegexURL_WhiteList section builds that regex pattern from the 1st array. if i haven't made something clear, please ask ...
$URL_WhiteList = @(
'gmail.com'
'yahoo.com'
'live.com'
)
$RegexURL_WhiteList = -join @('\b' ,(@($URL_WhiteList |
ForEach-Object {
[regex]::Escape($_)
}) -join '|\b'))
$NeedFiltering = @(
'example.com/this/that'
'GMail.com'
'gmailstuff.org/NothingElse'
'NotReallyYahoo.com'
'www.yahoo.com'
'SomewhereFarAway.net/maybe/not/yet'
'live.net'
'Live.com/other/another'
)
foreach ($NF_Item in $NeedFiltering)
{
if ($NF_Item -match $RegexURL_WhiteList)
{
'[ {0} ] matched one of the test URLs.' -f $NF_Item
}
}
output ...
[ GMail.com ] matched one of the test URLs.
[ www.yahoo.com ] matched one of the test URLs.
[ Live.com/other/another ] matched one of the test URLs.
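The operand swap on its own, applied to the Where-Object filter from the question, might look like the sketch below. The whitelist entries are the ones listed in the question; the domain value is a made-up test input.

```powershell
# Whitelist of regex patterns, as in the question's WhiteList.txt
$FinalWhiteListArray = '(?i)gmail\.com', '(?i)yahoo\.com', '(?i)live\.com'

# Hypothetical domain extracted from an email address
$Domain = 'GMail.com'

# The value under test goes on the left, the regex ($_) on the right.
$results = $FinalWhiteListArray | Where-Object { $Domain -match $_ }
$results
```

$results then holds the whitelist pattern(s) that the domain matched, or nothing if the domain is not whitelisted.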

Use Powershell to comment out a 'codeblock' in a text file?

I'm trying to comment out some code in a large number of files.
The files all contain something along the lines of:
stage(&apos;inrichting&apos;){
steps{
build job: &apos;SOMENAME&apos;, parameters: param
build job: &apos;SOMEOTHERNAME&apos;, parameters: param
echo &apos;TEXT&apos;
}
}
The content within steps{ } varies, but always consists of 0..N 'echo' and 0..N 'build job' lines.
I need an output like:
//stage(&apos;inrichting&apos;){
// steps{
// build job: &apos;SOMENAME&apos;, parameters: param
// build job: &apos;SOMEOTHERNAME&apos;, parameters: param
// echo &apos;TEXT&apos;
// }
//}
Is there any good way to do this with PowerShell? I tried some stuff with pattern.replace but didn't get very far.
$list = Get-ChildItem -Path 'C:\Program Files (x86)\Jenkins\jobs' -Filter config.xml -Recurse -ErrorAction SilentlyContinue -Force | % { $_.fullname };
foreach ($item in $list) {
...
}
This is a bit tricky, as you're trying to find that whole section, and then add comment markers to all lines in it. I'd probably write an ad-hoc parser with switch -regex if your structure allows for it (counting braces may make things more robust, but is also a bit harder to get right for all cases). If the code is regular enough you can perhaps reduce it to the following:
stage(&apos;inrichting&apos;){
steps{
... some amount of lines that don't contain braces
}
}
and we can then check for occurrence of the two fixed lines at the start and eventually two lines with closing braces:
foreach ($file in $list) {
# lines of the file
$lines = Get-Content $file
# line numbers to comment out
$linesToComment = @()
# line number of the current block to comment
$currentStart = -1
# the number of closing braces on single lines we've encountered for the current block
$closingBraces = 0
for ($l = 0; $l -lt $lines.Count; $l++) {
switch -regex ($lines[$l]) {
'^\s*stage\(&apos;inrichting&apos;\)\{' {
# found the first line we're looking for
$currentStart = $l
}
'^\s*steps\{' {
# found the second line, it may not belong to the same block, so reset if needed
if ($l -ne $currentStart + 1) { $currentStart = -1 }
}
'^\s*}' {
# only count braces if we're at the correct point
if ($currentStart -ne -1) { $closingBraces++ }
if ($closingBraces -eq 2) {
# we've reached the end, add the range to the lines to comment out
$linesToComment += $currentStart..$l
$currentStart = -1
$closingBraces = 0
}
}
}
}
$commentedLines = 0..($lines.Count-1) | % {
if ($linesToComment -contains $_) {
'//' + $lines[$_]
} else {
$lines[$_]
}
} | Set-Content $file
}
Untested, but the general idea might work.
Update: fixed and tested
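To try the idea without touching any files, here is a simplified streaming sketch that works on an in-memory array of lines. Add-BlockComment is a hypothetical helper name, and the sketch assumes the block is always well-formed: it starts commenting at the stage line and stops after the second line that begins with a closing brace, without checking that steps{ follows.

```powershell
# Hypothetical helper: prefixes every line of the stage/steps block with //
function Add-BlockComment {
    param([string[]]$Lines)
    $inBlock = $false
    $closing = 0
    foreach ($line in $Lines) {
        # The stage line opens the block to comment out
        if (-not $inBlock -and $line -match '^\s*stage\(&apos;inrichting&apos;\)\{') {
            $inBlock = $true
            $closing = 0
        }
        if ($inBlock) {
            "//$line"
            # Count lines that start with a closing brace; the second one ends the block
            if ($line -match '^\s*\}') {
                $closing++
                if ($closing -eq 2) { $inBlock = $false }
            }
        } else {
            $line
        }
    }
}

$sample = @(
    'before'
    'stage(&apos;inrichting&apos;){'
    '    steps{'
    '        echo &apos;TEXT&apos;'
    '    }'
    '}'
    'after'
)
$commented = Add-BlockComment -Lines $sample
$commented
```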

Validating file name input in Powershell

I would like to validate input for file name and check if it contains invalid characters, in PowerShell. I had tried following approach, which works when just one of these character is entered but doesn't seem to work when a given alpha-numeric string contains these characters. I believe I didn't construct regex correctly, what would be the right way to validate whether given string contains these characters? Thanks in advance.
#Validate file name whether it contains invalid characters: \ / : * ? " < > |
$filename = "\?filename.txt"
if($filename -match "^[\\\/\:\*\?\<\>\|]*$")
{Write-Host "$filename contains invalid characters"}
else
{Write-Host "$filename is valid"}
I would use Path.GetInvalidFileNameChars() rather than hardcoding the characters in a regex pattern, and then use the String.IndexOfAny() method to test if the file name contains any of the invalid characters:
function Test-ValidFileName
{
param([string]$FileName)
$IndexOfInvalidChar = $FileName.IndexOfAny([System.IO.Path]::GetInvalidFileNameChars())
# IndexOfAny() returns the value -1 to indicate no such character was found
return $IndexOfInvalidChar -eq -1
}
and then:
$filename = "\?filename.txt"
if(Test-ValidFileName $filename)
{
Write-Host "$filename is valid"
}
else
{
Write-Host "$filename contains invalid characters"
}
If you don't want to define a new function, this could be simplified as:
if($filename.IndexOfAny([System.IO.Path]::GetInvalidFileNameChars()) -eq -1)
{
Write-Host "$filename is valid"
}
else
{
Write-Host "$filename contains invalid characters"
}
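To fix the regex:
Try removing the ^ and $, which anchor it to the ends of the string, and also drop the * quantifier; without the anchors, * lets the pattern match an empty string, so every file name would be reported as invalid.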
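A minimal sketch of a corrected pattern, assuming the goal is to flag any name that contains at least one of the listed characters: an unanchored character class with no quantifier. The double quote is added to the class because the question's comment lists it but the original pattern omits it; the variable names are only for illustration.

```powershell
# One character class, no anchors, no quantifier: matches if any invalid character is present
$invalidPattern = '[\\/:*?"<>|]'

$results = foreach ($filename in '\?filename.txt', 'filename.txt') {
    if ($filename -match $invalidPattern) {
        "$filename contains invalid characters"
    } else {
        "$filename is valid"
    }
}
$results
```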

How to extract a substring in perl

I am new in perl and need your help.
I am reading the contents of files in a directory.
I need to extract the substring from the files containing *.dat
Sample strings:
1) # ** Template Name: IFDOS_ARCHIVE.dat
2) # ** profile for IFNEW_UNIX_CMD.dat template **
3) # ** Template IFWIN_MV.dat **
Need to Extract:
1) IFDOS_ARCHIVE.dat
2) IFNEW_UNIX_CMD.dat
3) IFWIN_MV.dat
My code:
if(open(my $jobprofile, "./profiles/$vitem[0].profile")) {
my @jobprofiletemp = <$jobprofile>;
close($jobprofile);
@proftemplates = grep /.dat/, @jobprofiletemp;
my $strproftemp = $proftemplates[0];
my ($tempksh) = $strproftemp =~ / ([^_]*)./;
print "tempksh: $tempksh","\n";
} else { warn "problem opening ./$_\n\n"; }
my regex is not working.
what do you suggest?
I think you'll be better off with something like:
while (<$jobprofile>) {
if ( /(\S+\.dat)/ ) {
print "$1\n";
}
}
(the while loop makes sure every single line is parsed)
The regular expression looks for a sequence of non-whitespace characters (\S+) followed by .dat.
The parentheses capture the whole file name, including the .dat extension, into the special variable $1.
Try this:
open my $fh, "<", "file.txt" or die "Cannot open file.txt: $!";
while (<$fh>)
{
next if /^\s*$/; # skip empty lines
my ($match) = /\s(\w+\.dat)/; # capture the file name into $match
print "$match\n" if defined $match; # print only when the line contained a match
}
or simply try the perl one liner
perl -ne' print $1,"\n" if (m/(\w+\.dat)/) ' file.txt
Or, if you are working on Linux, use grep to do it:
grep -ohP '\w+\.dat' one.txt
-o displays only the matched part of each line
-h suppresses the file name in the output
-P enables Perl-compatible regular expressions