Add capture group values in a PowerShell replace loop - regex

I need to replace a string in multiple text files with the same string, except with capture group 2 replaced by the sum of itself and capture group 4.
String: Total amount $11.39 | Change $0.21
Desired Result: Total amount $11.60 | Change $0.21
I have attempted several methods. Here is my last attempt, which seems to run without error but makes no changes to the string.
$Originalfolder = "$ENV:userprofile\Documents\folder\"
$Originalfiles = Get-ChildItem -Path "$Originalfolder\*"
$RegexPattern = '\b(Total\s\amount\s\$)(\d?\d?\d?\d?\d\.?\d?\d?)(\s\|\sChange\s\$)(\d?\d?\d\.?\d?\d?)\b'
$Substitution = {
Param($Match)
$Result = $GP1 + $Sumtotal + $GP3 + $Change
$GP1 = $Match.Groups[1].Value
$Total = $Match.Groups[2].Value
$GP3 = $Match.Groups[3].Value
$Change = $Match.Groups[4].Value
$Sumtotal = ($Total + $Change)
return [string]$Result
}
foreach ($file in $Originalfiles) {
$Lines = Get-Content $file.FullName
$Lines | ForEach-Object {
[Regex]::Replace($_, $RegexPattern, $Substitution)
} | Set-Content $file.FullName
}

For one thing, your regular expression doesn't even match what you're trying to replace, because you escaped the a in amount:
\b(Total\s\amount\s\$)(\d?\d?\d?...
# ^^
\a is an escape sequence that matches the "alarm" or "bell" character \u0007.
Also, if you want to calculate the sum of two captures you need to convert them to numeric values first, otherwise the + operator would just concatenate the two strings.
$Total = $Match.Groups[2].Value
$Change = $Match.Groups[4].Value
$Sumtotal = $Total + $Change # gives 11.390.21
$Sumtotal = [double]$Total + [double]$Change # gives 11.6
And you need to build $Result after you defined the other variables, otherwise the replacement function would just return an empty string.
Change this:
$RegexPattern = '\b(Total\s\amount\s\$)(\d?\d?\d?\d?\d\.?\d?\d?)(\s\|\sChange\s\$)(\d?\d?\d\.?\d?\d?)\b'
$Substitution = {
param ($Match)
$Result = $GP1 + $Sumtotal + $GP3 + $Change
$GP1 = $Match.Groups[1].Value
$Total = $Match.Groups[2].Value
$GP3 = $Match.Groups[3].Value
$Change = $Match.Groups[4].Value
$Sumtotal = ($Total + $Change)
return [string]$Result
}
into this:
$RegexPattern = '\b(Total\samount\s\$)(\d?\d?\d?\d?\d\.?\d?\d?)(\s\|\sChange\s\$)(\d?\d?\d\.?\d?\d?)\b'
$Substitution = {
Param($Match)
$GP1 = $Match.Groups[1].Value
$Total = [double]$Match.Groups[2].Value
$GP3 = $Match.Groups[3].Value
$Change = [double]$Match.Groups[4].Value
$Sumtotal = ($Total + $Change)
$Result = $GP1 + $Sumtotal + $GP3 + $Change
return [string]$Result
}
and the code will mostly do what you want. "Mostly", because it will not format the calculated number with two decimal places. You need to do that yourself. Use the format operator (-f) and change your replacement function to something like this:
$Substitution = {
Param($Match)
$GP1 = $Match.Groups[1].Value
$Total = [double]$Match.Groups[2].Value
$GP3 = $Match.Groups[3].Value
$Change = [double]$Match.Groups[4].Value
$Sumtotal = $Total + $Change
return ('{0}{1:n2}{2}{3:n2}' -f $GP1, $Sumtotal, $GP3, $Change)
}
As a side note: the sub-expression \d?\d?\d?\d?\d\.?\d?\d? could be shortened to \d+(?:\.\d+)? (one or more digits, optionally followed by a period and one or more digits) or, more exactly, to \d{1,5}(?:\.\d{0,2})? (one to five digits, optionally followed by a period and up to 2 digits).
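Putting the pieces together, here is a minimal self-contained sketch that applies the corrected pattern to the sample string from the question. It uses the shortened \d+(?:\.\d+)? sub-expression and, as an assumption not taken from the answer above, parses and formats with the invariant culture so the decimal separator stays a period regardless of regional settings. The names $sample, $pattern and $evaluator are illustrative only.

```powershell
# Sample line from the question (single quotes keep the $ signs literal)
$sample  = 'Total amount $11.39 | Change $0.21'
$pattern = '\b(Total\samount\s\$)(\d+(?:\.\d+)?)(\s\|\sChange\s\$)(\d+(?:\.\d+)?)\b'

# Script block used as the MatchEvaluator: it is called once per match
$evaluator = {
    param($Match)
    # Assumption: invariant culture, so "11.39" always parses with a period
    $inv    = [cultureinfo]::InvariantCulture
    $total  = [double]::Parse($Match.Groups[2].Value, $inv)
    $change = [double]::Parse($Match.Groups[4].Value, $inv)
    $sum    = $total + $change
    # Rebuild the string: group 1, the new total, group 3, the unchanged amount
    $Match.Groups[1].Value + $sum.ToString('F2', $inv) +
        $Match.Groups[3].Value + $Match.Groups[4].Value
}

$result = [regex]::Replace($sample, $pattern, $evaluator)
$result
```

The 'F2' format specifier is used here instead of 'n2' because it never inserts a thousands separator, which may or may not be what you want for larger totals.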

Here's how I'd do it. This is pulled out of a larger script that regularly scans a directory for files and then does a similar manipulation. I've changed the variables quickly to obfuscate it, so shout if it doesn't work and I'll take a more detailed look tomorrow.
It also takes a backup of each file, and works on a temp copy before renaming.
Note that it also sends an email alert (code at the end) to say whether any processing was done. This is because the original is designed to run as a scheduled task.
$backupDir = "$pwd\backup"
$stringToReplace = "."
$newString = "."
$files = @(Get-ChildItem $directoryOfFiles)
$intFiles = $files.count
$tmpExt = ".tmpDataCorrection"
$DataCorrectionAppend = ".DataprocessBackup"
foreach ($file in $files) {
$content = Get-Content -Path ( $directoryOfFiles + $file )
# Check whether there are any instances of the string
If (!($content -match $stringToReplace)) {
# Do nothing if we didn't match
}
Else {
#Create another blank temporary file which the corrected file contents will be written to
$tmpFileName_DataCorrection = $file.Name + $tmpExt
$tmpFile_DataCorrection = $directoryOfFiles + $tmpFileName_DataCorrection
New-Item -ItemType File -Path $tmpFile_DataCorrection
foreach ( $line in $content ) {
If ( $line.Contains("#")) {
Add-Content -Path $tmpFile_DataCorrection -Value $line.Replace($stringToReplace,$newString)
#Counter to know whether any processing was done or not
$processed++
}
Else {
Add-Content -Path $tmpFile_DataCorrection -Value $line
}
}
#Backup (rename) the original file, and rename the temp file to be the same name as the original
Rename-Item -Path $file.FullName -NewName ($file.FullName + $DataCorrectionAppend) -Force -Confirm:$false
Move-Item -Path ( $file.FullName + $DataCorrectionAppend ) -Destination $backupDir -Force -Confirm:$false
Rename-Item -Path $tmpFile_DataCorrection -NewName $file.FullName -Force -Confirm:$false
# Check to see if anything was done, then populate a variable to use in final email alert if there was
If (!$processed) {
#no message as did nothing
}
Else {
New-Variable -Name ( "processed" + $file.Name) -Value $strProcessed
}
} # Out of If loop
}

PowerShell: Replace only first occurrence of a line/string in entire file

I have the following beginning of a PowerShell script in which I would like to replace the values of variables for a different environment.
$SomeVar1 = "C:\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewriten
$SomeVar2 = "C:\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewriten
When I run the rewrite script, the result should look like this:
$SomeVar1 = "F:\different\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewrite
$SomeVar2 = "F:\different\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewriten
Current script that does(n't) rewrite:
$arr = @(
[PSCustomObject]@{Regex = '$SomeVar1 = "'; Replace = '$SomeVar1 = "F:\different\path\to\file\a"'}
[PSCustomObject]@{Regex = '$SomeVar2 = "'; Replace = '$SomeVar2 = "F:\different\path\to\file\b"'}
)
for ($i = 0; $i -lt $arr.Length; $i++) {
$ArrRegex = [Regex]::Escape($arr[$i].Regex)
$ArrReplace = $arr[$i].Replace
# Get full line for replacement
$Line = Get-Content $Workfile | Select-String $ArrRegex | Select-Object -First 1 -ExpandProperty Line
# Rewrite part
$Line = [Regex]::Escape($Line)
$Content = Get-Content $Workfile
$Content -replace "^$Line",$ArrReplace | Set-Content $Workfile
}
This replaces all the occurrences in the file at the start of a line (and I need only the 1st one) and does not replace the one in the Note, which is okay.
Then I found this: "Powershell: Replace last occurence of a line in a file", which does the exact opposite of what I need. It only rewrites the last occurrence of the string, and it does so in the Note as well. I would like to change it to do the opposite: rewrite only the 1st occurrence at the beginning of a line (which won't target the Note).
Code in my case looks like this:
# Rewrite part
$Line = [Regex]::Escape($Line)
$Content = Get-Content $Workfile -Raw
$Line = "(?s)(.*)$Line"
$ArrReplace = "`$1$ArrReplace"
$Content -replace $Line,$ArrReplace | Set-Content $Workfile
Do you have any recommendations on how to achieve my goal, or is there a more sophisticated way to replace variables for PowerShell scripts like this?
Thanks in advance.
So I finally figured it out. I had to add Select-String "^$ArrRegex" when creating $Line to exclude any matches that are not at the beginning of a line, and then use this regex to do the job: ^(?s)(.*?\n)$Line
In my case it does the following: it selects only the 1st occurrence at the beginning of a line and replaces it. It ignores everything else and, when re-run, does not rewrite the others. The copies of the variables will not really exist in the final version; each will be set once, like $Var1 = "Value", and never changed during the script, but I wanted to be sure that I won't replace something by mistake.
The final replacing part does look like this:
for ($i = 0; $i -lt $arr.Length; $i++) {
$ArrRegex = [Regex]::Escape($arr[$i].Regex)
$ArrReplace = $arr[$i].Replace
$Line = Get-Content $Workfile | Select-String "^$ArrRegex" | Select-Object -First 1 -ExpandProperty Line
$Line = [Regex]::Escape($Line)
# Re-read the file as a single string so the pattern can span lines
$Content = Get-Content $Workfile -Raw
$Line = "^(?s)(.*?\n)$Line"
$ArrReplace = "`$1$ArrReplace"
$Content -replace $Line, $ArrReplace | Set-Content $Workfile
}
You could possibly use flag variables like below to only do the first replacement for each of your regex patterns.
$Altered = Get-Content -Path $Workfile |
Foreach-Object {
if(-not $a) { #If replacement hasn't been done, replace
$_ = $_ -replace 'YOUR_REGEX1','YOUR_REPLACEMENT1'
if($_ -match 'YOUR_REPLACEMENT1') { $a = 'replacement done' } #Set Flag
}
if(-not $b) { #If replacement hasn't been done, replace
$_ = $_ -replace 'YOUR_REGEX2','YOUR_REPLACEMENT2'
if($_ -match 'YOUR_REPLACEMENT2') { $b = 'replacement done' } #Set Flag
}
$_ # Pipe back to $Altered
}
$Altered | Set-Content -Path $WorkFile
Just reverse the RegEx, if that is what you are after:
Clear-Host
@'
abc
abc
abc
'@ -replace '^(.*?)\babc\b', '$1HelloWorld'
# Results
<#
HelloWorld
abc
abc
#>
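Another way to limit a replacement to the first line-start occurrence is the instance method Regex.Replace(input, replacement, count), which stops after count replacements. The sketch below is an alternative to the approaches above rather than a restatement of them. It works on an in-memory copy of the sample from the question, and the names $text, $regex, $replacement and $updated are illustrative.

```powershell
# Sample content from the question, joined into one string
$text = @(
    '$SomeVar1 = "C:\path\to\file\a"'
    '$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewritten'
    '$SomeVar2 = "C:\path\to\file\b"'
    '# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewritten'
) -join "`n"

# (?m) makes ^ match at the start of every line, so the Note line is never hit
$regex = [regex]'(?m)^\$SomeVar1 = "[^"]*"'

# In a replacement string $ is special, so double it to keep it literal
$replacement = '$SomeVar1 = "F:\different\path\to\file\a"'.Replace('$', '$$')

# The third argument limits the number of replacements to one
$updated = $regex.Replace($text, $replacement, 1)
$updated
```

Only the quoted value on the first matching line is swapped; the copy on the second line and the Note line stay as they were. For a real file you would read it with Get-Content -Raw and write $updated back with Set-Content.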

Find Pattern (Not exact string) in .XLSX with Powershell

I can find exact strings, but I can't seem to find the correct function or syntax to find a pattern, for example [0-9], in an .xlsx file. The search finds that exact string but not matches for the pattern, which is supposed to be just a digit between 0 and 9. The reason is that I am using the Find function, which matches exact strings. I know it is possible to find a pattern but just can't seem to figure it out. I have to open the files through Excel because my initial Get-ChildItem script does not work with .xlsx files. Below is the code. Any help or ideas will be greatly appreciated. I have put 3 asterisks where I think the issue is, but I just can't see what the solution is.
$SearchText = '[0-9]'
$path = "C:\users\username\desktop"
$output = "c:\users\username\desktop\results.txt"
$files = Get-Childitem $path -Include *.xlsx, *.xlsm, *.xlsb -Recurse
Function Search-Excel {
$Excel = New-Object -ComObject Excel.Application
ForEach($file in $files)
{
$Workbook = $Excel.Workbooks.Open($file)
ForEach ($Worksheet in @($Workbook.Sheets)) {
***$Found = $WorkSheet.Cells.Find($SearchText)***
If ($Found) {
$BeginAddress = $Found.Address(0,0,1,1)
[pscustomobject]@{
FilePath = $Workbook.Path
FileName = $Workbook.Name
WorkSheet = $Worksheet.Name
Column = $Found.Column
Row = $Found.Row
Text = $Found.Text
Address = $BeginAddress
}
Do {
$Found = $WorkSheet.Cells.FindNext($Found)
$Address = $Found.Address(0,0,1,1)
If ($Address -eq $BeginAddress) {
BREAK
}
[pscustomobject]@{
FilePath = $Workbook.Path
FileName = $Workbook.Name
WorkSheet = $Worksheet.Name
Column = $Found.Column
Row = $Found.Row
Text = $Found.Text
Address = $Address
}
} Until ($False)
$workbook.Close($false) }
Else {
Write-Warning "[$($WorkSheet.Name)] Nothing Found!"
}}
}
[void][System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$excel)
[gc]::Collect()
[gc]::WaitForPendingFinalizers()
Remove-Variable excel -ErrorAction SilentlyContinue
}
Search-Excel | Out-File $output -Append

Skip Header Row in a High Performance Powershell Regex Script Block

I received some amazing help from Stack Overflow; however, it was so amazing that I need a little more help to get closer to the finish line. I'm parsing multiple enormous 4GB files twice per month. I need to be able to skip the header and count the total lines, the matched lines, and the unmatched lines. I'm sure this is super simple for a PowerShell superstar, but at my newbie level my skills are not yet strong. Perhaps a little help from you would save the week. :)
Data Sample:
ID FIRST_NAME LAST_NAME COLUMN_NM_TOO_LON5THCOLUMN
10000000001MINNIE MOUSE COLUMN VALUE LONGSTARTS
10000000002MICKLE ROONEY MOUSE COLUMN VALUE LONGSTARTS
Code Block (based on this answer):
#$match_regex matches each fixed length field by length; the () specifies that each matched field be stored in a capture group:
[regex]$match_regex = '^(.{10})(.{50})(.{50})(.{50})(.{50})(.{3})(.{8})(.{4})(.{50})(.{2})(.{30})(.{6})(.{3})(.{4})(.{25})(.{2})(.{10})(.{3})(.{8})(.{4})(.{50})(.{2})(.{30})(.{6})(.{3})(.{2})(.{25})(.{2})(.{10})(.{3})(.{10})(.{10})(.{10})(.{2})(.{10})(.{50})(.{50})(.{50})(.{50})(.{8})(.{4})(.{50})(.{2})(.{30})(.{6})(.{3})(.{2})(.{25})(.{2})(.{10})(.{3})(.{4})(.{2})(.{4})(.{10})(.{38})(.{38})(.{15})(.{1})(.{10})(.{2})(.{10})(.{10})(.{10})(.{10})(.{38})(.{38})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})(.{10})$'
Measure-Command {
& {
switch -File $infile -Regex {
$match_regex {
# Join what all the capture groups matched with a tab char.
$Matches[1..($Matches.Count-1)].Trim() -join "`t"
}
}
} | Out-File $outFile
}
You only need to keep track of two counts - matched, and unmatched lines - and then a Boolean to indicate whether you've skipped the first line
$first = $false
$matched = 0
$unmatched = 0
. {
switch -File $infile -Regex {
$match_regex {
if($first){
# Join what all the capture groups matched with a tab char.
$Matches[1..($Matches.Count-1)].Trim() -join "`t"
$matched++
}
$first = $true
}
default{
$unmatched++
# you can remove this, if the pattern always matches the header
$first = $true
}
}
} | Out-File $outFile
$total = $matched + $unmatched
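To see the counting logic in isolation, here is a small sketch that runs the same kind of switch -Regex over an in-memory array instead of a file. The two-field pattern, the sample lines and the names $lines, $headerSkipped and $rows are made up for illustration. Unlike the code above, this variant does not count the header as unmatched when it fails the pattern.

```powershell
# Hypothetical fixed-width data: a 5-character ID followed by a 5-character name
$lines = @(
    'ID   NAME '
    '00001MINNI'
    'bad'
    '00002MICKL'
)
$pattern = '^(.{5})(.{5})$'

$headerSkipped = $false
$matched   = 0
$unmatched = 0

# switch also accepts an array; each element is tested against the pattern
$rows = switch -Regex ($lines) {
    $pattern {
        if ($headerSkipped) {
            # Join every capture group (index 1 and up) with a tab character
            ($Matches[1..($Matches.Count - 1)]).Trim() -join "`t"
            $matched++
        }
        $headerSkipped = $true
    }
    default {
        # Only count data lines, never the header itself
        if ($headerSkipped) { $unmatched++ }
        $headerSkipped = $true
    }
}

$total = $matched + $unmatched
"Matched: $matched  Unmatched: $unmatched  Total: $total"
```

With a real file, replacing ($lines) with -File $infile gives the streaming behavior of the original code.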
Using System.IO.StreamReader reduced the processing time to about 20% of what it had been. This was absolutely needed for my requirement.
I added logic and counters without sacrificing much on performance. The field counter and row by row comparison is particularly helpful in finding bad records.
This is a copy/paste of actual code but I shortened some things, made some things slightly pseudo code, so you may have to play with it to get things working just so for yourself.
Function Get-Regx-Data-Format() {
Param ([String] $filename)
if ($filename -eq 'FILE NAME') {
[regex]$match_regex = '^(.{10})(.{10})(.{10})(.{30})(.{30})(.{30})(.{4})(.{1})'
}
return $match_regex
}
Foreach ($file in $cutoff_files) {
$starttime_for_file = (Get-Date)
$source_file = $file + '_' + $proc_yyyymm + $source_file_suffix
$source_path = $source_dir + $source_file
$parse_file = $file + '_' + $proc_yyyymm + '_load' +$parse_target_suffix
$parse_file_path = $parse_target_dir + $parse_file
$error_file = $file + '_err_' + $proc_yyyymm + $error_target_suffix
$error_file_path = $error_target_dir + $error_file
[regex]$match_data_regex = Get-Regx-Data-Format $file
Remove-Item -path "$parse_file_path" -Force -ErrorAction SilentlyContinue
Remove-Item -path "$error_file_path" -Force -ErrorAction SilentlyContinue
[long]$matched_cnt = 0
[long]$unmatched_cnt = 0
[long]$loop_counter = 0
[boolean]$has_header_row=$true
[int]$field_cnt=0
[int]$previous_field_cnt=0
[int]$array_length=0
$parse_minutes = Measure-Command {
try {
$stream_log = [System.IO.StreamReader]::new($source_path)
$stream_in = [System.IO.StreamReader]::new($source_path)
$stream_out = [System.IO.StreamWriter]::new($parse_file_path)
$stream_err = [System.IO.StreamWriter]::new($error_file_path)
# Compare against $null so that an empty line does not end the loop early
while ($null -ne ($line = $stream_in.ReadLine())) {
if ($line -match $match_data_regex) {
#if matched and it's the header, parse and write to the beg of output file
if (($loop_counter -eq 0) -and $has_header_row) {
$stream_out.WriteLine(($Matches[1..($array_length)].Trim() -join "`t"))
} else {
$previous_field_cnt = $field_cnt
#add year month to line start, trim and join every captured field w/tabs
$stream_out.WriteLine("$proc_yyyymm`t" + `
($Matches[1..($array_length)].Trim() -join "`t"))
$matched_cnt++
$field_cnt=$Matches.Count
if (($previous_field_cnt -ne $field_cnt) -and $loop_counter -gt 1) {
write-host "`nError on line $($loop_counter + 1). `
The field count does not match the previous correctly `
formatted (non-error) row."
}
}
} else {
if (($loop_counter -eq 0) -and $has_header_row) {
#if the header, write to the beginning of the output file
$stream_out.WriteLine($line)
} else {
$stream_err.WriteLine($line)
$unmatched_cnt++
}
}
$loop_counter++
}
} finally {
$stream_in.Dispose()
$stream_out.Dispose()
$stream_err.Dispose()
$stream_log.Dispose()
}
} | Select-Object -Property TotalMinutes
write-host "`n$file_list_idx. File $file parsing results....`nMatched Count =
$matched_cnt UnMatched Count = $unmatched_cnt Parse Minutes = $parse_minutes`n"
$file_list_idx++
$endtime_for_file = (Get-Date)
write-host "`nEnded processing file at $endtime_for_file"
$TimeDiff_for_file = (New-TimeSpan $starttime_for_file $endtime_for_file)
$Hrs_for_file = $TimeDiff_for_file.Hours
$Mins_for_file = $TimeDiff_for_file.Minutes
$Secs_for_file = $TimeDiff_for_file.Seconds
write-host "`nElapsed Time for file $file processing:
$Hrs_for_file`:$Mins_for_file`:$Secs_for_file"
}
$endtime = (Get-Date -format "HH:mm:ss")
$TimeDiff = (New-TimeSpan $starttime $endtime)
$Hrs = $TimeDiff.Hours
$Mins = $TimeDiff.Minutes
$Secs = $TimeDiff.Seconds
write-host "`nTotal Elapsed Time: $Hrs`:$Mins`:$Secs"
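Stripped of the project-specific paths and logging, the core of the StreamReader approach can be sketched as follows. This is a simplified illustration under stated assumptions, not the author's production code: the two-field pattern, the sample data and the temporary files are invented so the example can run on its own.

```powershell
# Hypothetical input: a header, two good rows, an empty line and a bad row
$inPath  = [System.IO.Path]::GetTempFileName()
$outPath = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllLines($inPath, [string[]]@('ID   NAME ', '00001MINNI', '', 'bad', '00002MICKL'))

$regex     = [regex]'^(.{5})(.{5})$'
$lineNo    = 0
$matched   = 0
$unmatched = 0
$reader    = $null
$writer    = $null

try {
    $reader = [System.IO.StreamReader]::new($inPath)
    $writer = [System.IO.StreamWriter]::new($outPath)
    # ReadLine returns $null at end of file; an empty line is still a line
    while ($null -ne ($line = $reader.ReadLine())) {
        $lineNo++
        if ($lineNo -eq 1) { continue }   # skip the header row
        $m = $regex.Match($line)
        if ($m.Success) {
            # Collect capture groups 1..n, trimmed, and write them tab-separated
            $fields = for ($g = 1; $g -lt $m.Groups.Count; $g++) { $m.Groups[$g].Value.Trim() }
            $writer.WriteLine($fields -join "`t")
            $matched++
        } else {
            $unmatched++
        }
    }
} finally {
    if ($reader) { $reader.Dispose() }
    if ($writer) { $writer.Dispose() }
}

$written = @(Get-Content $outPath)
Remove-Item $inPath, $outPath
"Matched: $matched  Unmatched: $unmatched  Lines written: $($written.Count)"
```

The try/finally block makes sure both streams are closed even if a line throws, which matters when the files are several gigabytes and locked by the process.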

How do I use the values (read from a .ps1 file) to update the values of another .ps1 file

I have a 4.ps1 file that looks like this
#akabradabra
$one = 'o'
#bibi
$two = 't'
$three = 't' #ok thr
#four
$four = 'four'
And a 3.ps1 file that looks like this
#akabradabra
$one = 'one'
#biblibablibo
$two = 'two'
$three = 'three' #ok threer
My goal is to read the key-value pair from 4.ps1 and update the values in 3.ps1 and if new key-value pairs are introduced in 4.ps1, simply append them to the end of 3.ps1.
My idea is to use string functions such as .Split('=') and .Replace(' ', '') to extract the keys and if the keys match, replace the entire line in 3.ps1 with the one found in 4.ps1
I know that using Get-Variable might do the trick, and that the data would be a lot easier to work with if I converted all the key-value pairs into an .xml or a .json file, but can anyone please show me how I can make it work in my own silly way?
Here is my code to do so
# Ignore this function, this is used to skip certain key-value pairs
#----------------------------------------------------------------------------
Function NoChange($something) {
switch ($something) {
'$CurrentPath' {return $true}
'$pathToAdmin' {return $true}
'$hostsPathTocompare' {return $true}
'$logs' {return $true}
'$LogFile' {return $true}
default {return $false}
}
}
#----------------------------------------------------------------------------
$ReadFromVARS = Get-Content $PSScriptRoot\4.ps1
$WriteToVARS = Get-Content $PSScriptRoot\3.ps1
foreach ($oldVar in $ReadFromVARS) {
if (('' -eq $oldVar) -or ($oldVar -match '\s*#+\w*')) {
continue
} elseif ((NoChange ($oldVar.Split('=').Replace(' ', '')[0]))) {
continue
} else {
$var = 0
#$flag = $false
while ($var -ne $WriteToVARS.Length) {
if ($WriteToVARS[$var] -eq '') {
$var += 1
continue
} elseif ($WriteToVARS[$var] -match '\s*#+\w*') {
$var += 1
continue
} elseif ($oldVar.Split('=').Replace(' ', '')[0] -eq $WriteToVARS[$var].Split('=').Replace(' ', '')[0]<# -and !$flag#>) {
$oldVar
$WriteToVARS.replace($WriteToVARS[$var], $oldVar) | Set-Content -Path $PSScriptRoot\3.ps1 -Force
break
#$var += 1
#$flag = $true
} elseif (<#!$flag -and #>($var -eq $WriteToVARS.Length)) {
Add-Content -Path $PSScriptRoot\3.ps1 -Value $oldVar -Force
$var += 1
} else {
$var += 1
}
}
}
}
I did not run into any errors, but it only updated one key-value pair ($two = t) and it did not append new key-value pairs at the end. Here is the result I got:
#akabradabra
$one = 'one'
#biblibablibo
$two = 't'
$three = 'three' #ok threer
If I understand your question correctly, I think Dot-Sourcing is what you're after.
The PowerShell dot-source operator brings script files into the current session scope. It is a way to reuse script. All script functions and variables defined in the script file become part of the script it is dot sourced into. It is like copying and pasting text from the script file directly into your script.
To make it visible, use Dot-Sourcing to read in the variables from file 3.ps1, show the variables and their values. Next dot-source file 4.ps1 and show the variables again:
. 'D:\3.ps1'
Write-Host "Values taken from file 3.ps1" -ForegroundColor Yellow
"`$one : $one"
"`$two : $two"
"`$three : $three"
"`$four : $four" # does not exist yet
. 'D:\4.ps1'
Write-Host "Values after dot-sourcing file 4.ps1" -ForegroundColor Yellow
"`$one : $one"
"`$two : $two"
"`$three : $three"
"`$four : $four"
The result is
Values taken from file 3.ps1
$one : one
$two : two
$three : three
$four :
Values after dot-sourcing file 4.ps1
$one : o
$two : t
$three : t
$four : four
If you want to write these variables back to a ps1 script file you can:
'one','two','three','four' | Get-Variable | ForEach-Object {
'${0} = "{1}"' -f $_.Name,$_.Value
} | Set-Content 'D:\5.ps1' -Force
Theo's answer provides an easier way to do the same thing.
Also, converting your config files to JSON or XML will make the job a lot easier too.
My original idea was to read both 4.ps1 and 3.ps1 (these are my config files; I only store variables in them, plus a switch statement to help choose the correct variables) and then overwrite 3.ps1 with all the differences found. I could not get that working, so I created a new 5.ps1 and simply wrote everything I need to it.
Here is my code if you would like to use it for your own project :-)
The obstacles for me were that I had switch statements and certain $variables that I wanted to ignore (in my actual project), so I used some regex to avoid them.
$ReadFromVARS = Get-Content $PSScriptRoot\4.ps1
$WriteToVARS = Get-Content $PSScriptRoot\3.ps1
New-Item -ItemType File -Path $PSScriptRoot\5.ps1 -Force
Function NoChange($something) {
switch ($something) {
'$CurrentPath' {return $true}
'$pathToAdmin' {return $true}
'$hostsPathTocompare' {return $true}
'$logs' {return $true}
'$LogFile' {return $true}
default {return $false}
}
}
$listOfOldVars = @()
$switchStatementStart = "^switch(\s)*\(\`$(\w)+\)(\s)*(\n)*\{"
$switchStatementContent = "(\s)*(\n)*(\t)*\'\w+(\.\w+)+\'(\s)*\{(\s)*\`$\w+(\s)*=(\s)*\@\((\s)*\'\w+(\.\w+)+\'(\s)*(,(\s)*\'\w+(\.\w+)+\'(\s)*)*\)\}"
$switchStatementDefault = "(\s)*(\n)*(\t)*Default(\s)*\{\`$\w+(\s)*=(\s)*\@\((\s)*\'\w+(\.\w+)+\'(\s)*(,(\s)*\'\w+(\.\w+)+\'(\s)*)*\)\}\}"
$switchStatementEnd = "(\s)*(\n)*(\t)*\}"
foreach ($oldVar in $ReadFromVARS) {
if (('' -eq $oldVar) -or ($oldVar -match '^#+\w*')) {
continue
} elseif ((NoChange $oldVar.Split('=').Replace(' ', '')[0])) {
continue
} else {
$var = 0
while ($var -ne $WriteToVARS.Length) {
if ($WriteToVARS[$var] -eq '') {
$var += 1
continue
} elseif ($WriteToVARS[$var] -match '^#+\w*') {
$var += 1
continue
} elseif ($oldVar -match $switchStatementStart -or $oldVar -match $switchStatementContent -or $oldVar -match $switchStatementDefault -or $oldVar -match $switchStatementEnd) {
Add-Content -Path "$PSScriptRoot\5.ps1" -Value $oldVar -Force
$listOfOldVars += ($oldVar)
break
} elseif ($oldVar.Split('=').Replace(' ', '')[0] -eq $WriteToVARS[$var].Split('=').Replace(' ', '')[0]) {
Add-Content -Path "$PSScriptRoot\5.ps1" -Value $oldVar -Force
$listOfOldVars += ($oldVar.Remove(0,1).Split('=').Replace(' ', '')[0])
break
} else {
$var += 1
}
}
}
}
foreach ($newVar in $WriteToVARS) {
if ($newVar.StartsWith('#') -or $newVar -eq '') {
continue
} elseif ($newVar -match $switchStatementStart -or $newVar -match $switchStatementContent -or $newVar -match $switchStatementDefault -or $newVar -match $switchStatementEnd) {
} elseif (($newVar.Remove(0,1).Split('=').Replace(' ', '')[0]) -in $listOfOldVars) {
continue
} else {
Add-Content -Path "$PSScriptRoot\5.ps1" -Value $newVar -Force
}
}
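The update-or-append behavior described in the question (replace a line in 3.ps1 when its key also appears in 4.ps1, otherwise append the new key at the end) can also be sketched with an ordered dictionary keyed by variable name. This is an illustrative alternative, not the code above: it works on in-memory copies of the two sample files, ignores the switch-statement handling, and the names $source, $target, $updates and $merged are hypothetical.

```powershell
# In-memory copies of the sample files (4.ps1 is the source, 3.ps1 the target)
$source = @(
    '#akabradabra', "`$one = 'o'", '#bibi', "`$two = 't'",
    "`$three = 't' #ok thr", '#four', "`$four = 'four'"
)
$target = @(
    '#akabradabra', "`$one = 'one'", '#biblibablibo', "`$two = 'two'",
    "`$three = 'three' #ok threer"
)

# Captures the variable name of a simple assignment such as: $one = 'o'
$assignment = '^\s*(\$\w+)\s*='

# Collect the source assignments by variable name, keeping their order
$updates = [ordered]@{}
foreach ($line in $source) {
    if ($line -match $assignment) { $updates[$Matches[1]] = $line }
}

# Rewrite the target: swap in the source line when the key exists there
$seen = @{}
$merged = foreach ($line in $target) {
    if ($line -match $assignment -and $updates.Contains($Matches[1])) {
        $seen[$Matches[1]] = $true
        $updates[$Matches[1]]
    } else {
        $line
    }
}

# Append the keys that only exist in the source
$newLines = foreach ($key in $updates.Keys) {
    if (-not $seen.ContainsKey($key)) { $updates[$key] }
}
$merged = @($merged) + @($newLines)
$merged
```

Comments from the target are kept as they are, and comments from the source are not carried over; writing $merged back with Set-Content would complete the update.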

Special characters passed as parameters in URL

Requirement: I need a way to handle special characters like % and &. The code below needs to be tweaked so that special characters coming from the $control file are treated as is.
For example: one of the entries in the $control file is 25% Dextrose(25ml). I need $ie.Navigate to simply navigate to https://www.xxxy.com/search/all?name=25% Dextrose(25ml). Currently it gets routed to https://www.xxxy.com/search/all?name=25%% Dextrose(25ml) (note the extra % in the URL) and thus does not find that web page.
A few examples of special characters that need to be handled:
'/' - 32care Mouth/Throat
'%' - 3d1% Gel(30g)
'&' - Accustix Glucose & Protein
'/' - Ace Revelol(25/(2.5mg)
function getStringMatch
{
# Loop through all 2 digit combinations in the $path directory
foreach ($control In $controls)
{
$ie = New-Object -COMObject InternetExplorer.Application
$ie.visible = $true
$site = $ie.Navigate("https://www.xxxy.com/search/all?name=$control")
$ie.ReadyState
while ($ie.Busy -and $ie.ReadyState -ne 4){ sleep -Milliseconds 100 }
$link = $null
$link = $ie.Document.get_links() | where-object {$_.innerText -eq "$control"}
$link.click()
while ($ie.Busy -and $ie.ReadyState -ne 4){ sleep -Milliseconds 100 }
$ie2 = (New-Object -COM 'Shell.Application').Windows() | ? {
$_.Name -eq 'Windows Internet Explorer' -and $_.LocationName -match "^$control"
}
# NEED outerHTML of new page. CURRENTLY it is working for some.
$ie.Document.body.outerHTML > d:\med$control.txt
}
}
$controls = "Sporanox"
getStringMatch
You want to URL encode the URI. Add this at the very start:
Add-Type -AssemblyName 'System.Web'
And then encode the URL like this:
$controlUri = [System.Web.HttpUtility]::UrlEncode($control)
$site = $ie.Navigate("https://www.xxxy.com/search/all?name=$controlUri")
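If loading System.Web is not desirable, [System.Uri]::EscapeDataString is a built-in alternative that needs no Add-Type call. The sketch below only builds the URLs for a few of the product names from the question and does not drive Internet Explorer; $names and $urls are illustrative names. Note that it encodes a space as %20 rather than +, which is an assumption about what the site accepts.

```powershell
# Product names taken from the question
$names = @(
    '25% Dextrose(25ml)'
    'Accustix Glucose & Protein'
    '32care Mouth/Throat'
)

# Percent-encode each name before appending it to the query string
$urls = foreach ($name in $names) {
    'https://www.xxxy.com/search/all?name=' + [System.Uri]::EscapeDataString($name)
}
$urls
```

Here % becomes %25, & becomes %26 and / becomes %2F, so the server receives the literal characters instead of interpreting them as URL syntax.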
As Biffen pointed out, Web servers will treat special characters as codes. So in your case, $control needs to be modified so that the Web server understands where you want to go.
One way to fix it is to look for specific characters in the original product name you are looking for, and replace them with something that the server will understand.
Here is the entire code:
function getStringMatch
{
# Loop through all 2 digit combinations in the $path directory
foreach ($control In $controls)
{
$ie = New-Object -COMObject InternetExplorer.Application
$ie.visible = $true
$s = $control -replace '%','%25'
$s = $s -replace ' ','+'
$s = $s -replace '&','%26'
$s = $s -replace '/','%2F'
$site = $ie.Navigate("https://www.xxxy.com/search/all?name=$s")
while ($ie.Busy -and $ie.ReadyState -ne 4){ sleep -Milliseconds 100 }
$link = $null
$link = $ie.Document.get_links() | where-object {if ($_.innerText){$_.innerText.contains($control)}}
$link.click()
while ($ie.Busy){ sleep -Milliseconds 100 }
$ie.Document.body.outerHTML > d:\TEMP\med$control.txt
}
}
$controls = "Accustix Glucose & Protein"
getStringMatch
I tried with the following strings:
"3d1% Gel(30g)"
"Ace Revelol(25/2mg)"
"Accustix Glucose & Protein"
"32care Mouth/Throat"