Regular expression only finding first match

I'm working on something that is similar to other designs I've done, but for some reason, it's only finding the first key/value pair, whereas other ones found all of them. It looks good in regex101.com, which is where I typically test these.
I'm parsing C++ code to get what I need for a reference spreadsheet for error tracking across a system. The results go into a spreadsheet or are used as keys to look up info in another file. I do something similar for about 20 files, plus there's other data coming from a SQL query or an Access/MDB file. The data for this file looks like this:
m_ErrorMap.insert(make_pair(
MAKEWORD(scError,seFatal),
HOP_FATAL_ERROR ));
m_ErrorMap.insert(make_pair(
MAKEWORD(scError,seNotSelected),
HOP_NOT_SELECTED));
m_ErrorMap.insert(make_pair(
MAKEWORD(scError,seCoverOpen),
HOP_COVER_OPEN ));
m_ErrorMap.insert(make_pair(
MAKEWORD(scError,seLeverPosition),
HOP_LEVER_POSITION ));
m_ErrorMap.insert(make_pair(
MAKEWORD(scError,seJam),
HOP_JAM ));
I read this as a string from the file (looks good), and feed it into this Function as $fileContent:
Function Get-Contents60{
[cmdletbinding()]
Param ([string]$fileContent)
Process
{
#m_ErrorMap.insert(make_pair(
#MAKEWORD(scError,seJam),
#HOP_JAM ));
# construct regex
switch -Regex ($fileContent -split '\r?\n') { #this is splitting on each line test regex with https://regex101.com/
'MAKEWORD["("][\w]+,(\w+)[")"],' { #seJam
# add relevant key to key collection
$keys = $Matches[1] } #only match once
',(HOP.*?)[\s]' { # HOP_JAM
# we've reached the relevant error, set it for all relevant keys
foreach($key in $keys){
Write-Host "60 key: $key"
Write-Host "Matches[0]: $($Matches[0]) Matches[1]: $($Matches[1])"
$errorMap[$key] = $($Matches[1])
Write-Host "60 key: $key ... value: $($errorMap[$key])"
}
}
'break' {
# reset/clear key collection
$keys = @()
}
}#switch
#Write-Host "result:" $result -ForegroundColor Green
#$result;
return $errorMap
}#End of Process
}#End of Function
I stepped through it in VSCode, and it's finding the first key/value pair, but after that it's not finding anything. I looked at it in regex101.com, and it's finding line endings/breaks, and the MAKEWORD regex and HOP regex are matching what they should on each line.
I'm not sure if the issue is that they aren't all in the same line, and maybe I need to change it so it doesn't break on newline and breaks on something else for each key/value pair? I'm a little fuzzy on this.
I'm using PowerShell 5.1 and VSCode.
Update:
I modified Theo's answer and it worked great. I had simplified the class name from m_HopErrorMap to m_ErrorMap for this question, and the regular expression was grabbing that for each one. I modified that slightly, and Theo's works.
function Get-Contents60{
[cmdletbinding()]
Param ([string]$fileContent)
# create an ordered hashtable to store the results
$errorMap = [ordered]@{}
# process the lines one-by-one
switch -Regex ($fileContent -split '\r?\n') {
'MAKEWORD\([^,]+,([^)]+)\),' { # seJam, seFatal etc.
$key = $matches[1]
}
'(HOP_[^)]+)' {
$errorMap[$key] = $matches[1].Trim()
}
}
# output the completed data as object
[PsCustomObject]$errorMap
return $errorMap
}

I would simplify your function to
function Get-Contents60{
[cmdletbinding()]
Param ([string]$fileContent)
# create an ordered hashtable to store the results
$errorMap = [ordered]@{}
# process the lines one-by-one
switch -Regex ($fileContent -split '\r?\n') {
'MAKEWORD\([^,]+,([^)]+)\),' { # seJam, seFatal etc.
$key = $matches[1]
}
'(HOP[^)]+)' {
$errorMap[$key] = $matches[1].Trim()
}
}
# output the completed data as object
[PsCustomObject]$errorMap
}
Then, using your example text (I'm using a here-string for it, but in real life you would load the file content with $c = Get-Content -Path 'X:\TheErrors.txt' -Raw), call
$result = Get-Contents60 -fileContent $c
To display on screen
$result | Format-Table -AutoSize
giving you
seFatal seNotSelected seCoverOpen seLeverPosition seJam
------- ------------- ----------- --------------- -----
HOP_FATAL_ERROR HOP_NOT_SELECTED HOP_COVER_OPEN HOP_LEVER_POSITION HOP_JAM
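The other idea raised in the question, matching across the line breaks instead of splitting on them, could look roughly like the sketch below. It is not part of the answer above. It assumes every MAKEWORD(...) line is followed directly by its HOP_ constant, as in the sample data, and the pattern and variable names are only illustrative.

```powershell
# Sample input copied from the question (two entries shown for brevity)
$text = @'
m_ErrorMap.insert(make_pair(
MAKEWORD(scError,seFatal),
HOP_FATAL_ERROR ));
m_ErrorMap.insert(make_pair(
MAKEWORD(scError,seJam),
HOP_JAM ));
'@

# \s* also matches line breaks, so one pattern can span the MAKEWORD line
# and the HOP_ line that follows it.
$pattern = 'MAKEWORD\([^,]+,\s*(\w+)\)\s*,\s*(HOP_\w+)'

$errorMap = [ordered]@{}
foreach ($m in [regex]::Matches($text, $pattern)) {
    # Group 1 is the key (seFatal, seJam, ...), group 2 is the HOP_ constant
    $errorMap[$m.Groups[1].Value] = $m.Groups[2].Value
}

$errorMap
```

Because the whole text is scanned in one pass, there is no state to carry between lines, which is where the switch version needs the $key variable.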

Related

Powershell Compare value to multiple arrays

I am trying to compare data to multiple sources and then give me a report of the errors. Due to the changing nature of exceptions, I wanted to build an exception table in csv format that I can change on the fly.
I am going to give the data the best I can and show you what I'm trying to achieve and show you where I'm coming into problems.
The exceptions list holds the prefix to different types of accounts:
Exceptions List
_______________
FQ
Q
HQ
E
So if my Account was BND123 then I may have an account called FQBND123 or QBND123 I want to be able to add to this list if one of the teams decides they need to make a JQ account or anything like that in the future.
This is an example of Inventoryreport.csv I'm looking to parse:
Safe Target system user name
HUMAN_ABC QABC
HUMAN_CDE QCDE
HUMAN_FGHIJ QFGHIJ
HUMAN_P123456 root
HUMAN_KLMNO QKLMNO1
HUMAN_P789123 FQ789123
So I am looking to compare target system username to the safe name, and if the leading account is in the exception list, it passes it up, and if it does not, then it throws it as an error.
So in the case of the data above, the two rows below would throw an error.
HUMAN_P123456 root
HUMAN_KLMNO QKLMNO1
Root for obvious reason and the KLMNO account because of the trailing 1.
The problem I am getting is that it is saying everything is an error. If I hand type it in to the loop everything is fine.
I also had the exceptions in a foreach loop inside the one for the inventory, but it kept looping over the same results and still spat out everything.
Hopefully this is an OK explanation, I'm sure I'm making this harder than it needs to be.
$loc = $scriptPath = split-path -parent $MyInvocation.MyCommand.Definition
$D = $loc + "\Exceptions\Exceptions.csv"
$E = $loc + "\Import\InventoryReport.csv"
$exceptions = Import-Csv -LiteralPath $D
$inventory = Import-Csv -LiteralPath $E
$list2 = 'Inventory Report Exceptions'
$list3 = 'Target system user name'
$DO = $loc + "\Report\Inventory Report Errors" + "$((Get-Date).ToString('MM-dd-yyyy')).CSV"
$time = (Get-Date).ToString()
foreach ($item in $inventory) {
$input1 = $item.Safe -replace "HUMAN_"
$input4 = $item.Safe -replace "HUMAN_P"
$input2 = $item.$list3
$input3 = $item.Safe
if ($input2 -eq ($exceptions.$list2 + $input1) -or $input2 -eq ($exceptions.$list2 + $input4)) {
return
}
else {
$newitem = New-Object -TypeName PSCustomObject -Property @{
Safe = $input1
Owner = $input2
}| Export-CSV -LiteralPath $DO -NoTypeInformation -append
}
}
Your question is a bit long and not very clear...
Let's look if I got it right:
I shortened the exceptions list to a regular expression anchored at begin
simulate the inventory.csv with a here string
append a column Pass to that
iterate the entries comparing the split'ed values for equality and save in the new col.
## Q:\Test\2018\11\02\SO_53109141.ps1
$Inventory = @"
Safe,Target system user name
HUMAN_ABC,QABC
HUMAN_CDE,QCDE
HUMAN_FGHIJ,QFGHIJ
HUMAN_P123456,root
HUMAN_KLMNO,QKLMNO1
HUMAN_P789123,FQ789123
"@ | ConvertFrom-Csv
#$Exception = [regex]'^(FQ|Q|HQ|E)'
$Exception = [RegEx]("^("+((Import-Csv .\exceptions.csv).'Exceptions List' -join '|')+")")
$Fail = $Inventory | Select-Object *,Pass | ForEach-Object {
if ( ($_.Safe -split '_P?')[1] -ne ($_.'Target system user name' -split $Exception)[2]){
[PSCustomObject]@{
Safe = ($_.Safe -split '_')[1]
Owner= $_.'Target system user name'
}
}
}
$Fail | Export-Csv '.\new.csv' -NoTypeInformation
Sample output:
Safe Target system user name Pass
---- ----------------------- ----
HUMAN_ABC QABC True
HUMAN_CDE QCDE True
HUMAN_FGHIJ QFGHIJ True
HUMAN_P123456 root False
HUMAN_KLMNO QKLMNO1 False
HUMAN_P789123 FQ789123 True
EDIT you can read in the exception from a file:
> import-csv .\exceptions.csv
Exceptions List
---------------
FQ
Q
HQ
E
And build a RegEx from the content:
$Exception = [RegEx]("^("+((Import-Csv .\exceptions.csv).'Exceptions List' -join '|')+")")
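A variant that avoids indexing into the -split result is to build one anchored pattern per row and test the user name against it. This is only a sketch: Test-AccountName is a hypothetical helper, the prefix list is hard-coded here where the answer reads it from exceptions.csv, and accepting the account with or without a leading P mirrors the two -replace calls in the question.

```powershell
# Hypothetical helper: returns $true when UserName is <prefix><account>
function Test-AccountName {
    param(
        [string]$Safe,
        [string]$UserName,
        [string[]]$Prefixes
    )
    $account = $Safe -replace '^HUMAN_'
    # Accept the account name with or without a leading P
    $accounts = @($account, ($account -replace '^P')) |
        ForEach-Object { [regex]::Escape($_) }
    # Escape each prefix so characters such as '.' are matched literally
    $prefixAlt = ($Prefixes | ForEach-Object { [regex]::Escape($_) }) -join '|'
    $pattern = '^(?:{0})(?:{1})$' -f $prefixAlt, ($accounts -join '|')
    $UserName -match $pattern
}

$prefixes = 'FQ', 'Q', 'HQ', 'E'   # stand-in for the 'Exceptions List' column

$okRow   = Test-AccountName -Safe 'HUMAN_ABC'     -UserName 'QABC'    -Prefixes $prefixes
$rootRow = Test-AccountName -Safe 'HUMAN_P123456' -UserName 'root'    -Prefixes $prefixes
$badRow  = Test-AccountName -Safe 'HUMAN_KLMNO'   -UserName 'QKLMNO1' -Prefixes $prefixes
```

The trailing $ anchor is what rejects QKLMNO1: the prefix and account match, but the extra 1 is left over.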

Create custom POST-request body in PowerShell

I am running into a bit of trouble. I am trying to do a POST-request with PowerShell. The problem is that the request-body uses the same key (you can upload multiple images), multiple times, so I can't build a hashtable to send the request. So the requestbody looks like this:
name value
image 1.jpg
image 2.jpg
subject this is the subject
message this is a message
A user with a similar problem (but not the same context) asked this before, and got as a response to use a List with KeyValuePair class. See https://stackoverflow.com/a/5308691/4225082
I cannot seem to create this. I found this https://bensonxion.wordpress.com/2012/04/27/using-key-value-pairs-in-powershell-2/
They use $testDictionary=New-Object “System.Collections.Generic.Dictionary[[System.String],[System.String]]”
to make the dictionary, but this doesn't translate to a list.
I managed to create (what I think is needed) by using $r = New-Object "System.Collections.Generic.List[System.Collections.Generic.KeyvaluePair[string,string]]"
and created a key by using $s = New-Object "System.Collections.Generic.KeyvaluePair[string,string]", but I can't set the values of that key.
I also tried creating a FormObject, but you also can't use the same key multiple times.
What is the best and/or easiest way to do this?
I am going to answer my own question. Because of the research, I managed to use better search terms, and found someone with exactly the same problem:
Does Invoke-WebRequest support arrays as POST form parameters?
I got rid of a bug (?) by changing [HttpWebResponse] to [System.Net.HttpWebResponse] and added the -WebSession parameter. I only needed it for the cookie, so I implemented that and didn't bother about the other stuff, it might need some tweaking for someone else!
This seemed to work at first glance, BUT for elements with the same key, it created an array, which messed up the order of the requestbody. Without the right order, the website won't accept it.
I messed around a bit more, and now I edited it to make use of multidimensional arrays.
So I ended up with this (all credits to the original writer!):
function Invoke-WebRequestEdit
{
[CmdletBinding()]
Param
(
[Parameter(Mandatory=$true)][System.Uri] $Uri,
[Parameter(Mandatory=$false)][System.Object] $Body,
[Parameter(Mandatory=$false)][Microsoft.PowerShell.Commands.WebRequestMethod] $Method,
[Parameter(Mandatory=$false)][Microsoft.PowerShell.Commands.WebRequestSession] $WebSession
# Extend as necessary to match the signature of Invoke-WebRequest to fit your needs.
)
Process
{
# If not posting a NameValueCollection, just call the native Invoke-WebRequest.
if ($Body -eq $null -or $body.GetType().BaseType -ne [Array]) {
Invoke-WebRequest @PsBoundParameters
return;
}
$params = "";
$i = 0;
$j = $body.Count;
$first = $true;
foreach ($array in $body){
if (!$first) {
$params += "&";
} else {
$first = $false;
}
$params += [System.Web.HttpUtility]::UrlEncode($array[0]) + "=" + [System.Web.HttpUtility]::UrlEncode($array[1]);
}
$b = [System.Text.Encoding]::UTF8.GetBytes($params);
# Use HttpWebRequest instead of Invoke-WebRequest, because the latter doesn't support arrays in POST params.
$req = [System.Net.HttpWebRequest]::Create($Uri);
$req.Method = "POST";
$req.ContentLength = $b.Length; # byte count of the encoded body
$req.ContentType = "application/x-www-form-urlencoded";
$req.CookieContainer = $WebSession.Cookies
$str = $req.GetRequestStream();
$str.Write($b, 0, $b.Length);
$str.Close();
$str.Dispose();
[System.Net.HttpWebResponse] $res = $req.GetResponse();
$str = $res.GetResponseStream();
$rdr = New-Object -TypeName "System.IO.StreamReader" -ArgumentList ($str);
$content = $rdr.ReadToEnd();
$str.Close();
$str.Dispose();
$rdr.Dispose();
# Build a return object that's similar to a Microsoft.PowerShell.Commands.HtmlWebResponseObject
$ret = New-Object -TypeName "System.Object";
$ret | Add-Member -Type NoteProperty -Name "BaseResponse" -Value $res;
$ret | Add-Member -Type NoteProperty -Name "Content" -Value $content;
$ret | Add-Member -Type NoteProperty -Name "StatusCode" -Value ([int] $res.StatusCode);
$ret | Add-Member -Type NoteProperty -Name "StatusDescription" -Value $res.StatusDescription;
return $ret;
}
}
The $body parameter is made like this:
$form = @()
$form += ,@("value1", 'somevalue')
$form += ,@("value2", 'somevalue')
$form += ,@("value2", 'somevalue')
$form += ,@("value3", 'somevalue')
Everything looks good now. It still doesn't work, but my original version with unique keys also doesn't work anymore, so there's probably something else going wrong.
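For reference, the core of the approach (an ordered list of pairs turned into a form-encoded string that keeps repeated keys in order) can be sketched on its own. This is a minimal illustration, not the function above: it uses [uri]::EscapeDataString instead of System.Web.HttpUtility so no extra assembly has to be loaded, which also means spaces are encoded as %20 rather than +. The field names come from the example request body in the question.

```powershell
# Ordered list of name/value pairs; the same name may appear more than once
$form = @(
    @('image', '1.jpg'),
    @('image', '2.jpg'),
    @('subject', 'this is the subject')
)

# Encode each pair and join them with '&', preserving the original order
$body = ($form | ForEach-Object {
    '{0}={1}' -f ([uri]::EscapeDataString($_[0])), ([uri]::EscapeDataString($_[1]))
}) -join '&'

$body   # image=1.jpg&image=2.jpg&subject=this%20is%20the%20subject
```

A string built this way should be usable as the -Body of Invoke-WebRequest together with -ContentType 'application/x-www-form-urlencoded', though whether a particular site accepts it depends on the site.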

Powershell matching TWO values in an array/object

I'll explain what I am trying to achieve first, in case there is a better way than what I have written. I am trying to get a list of users (in the example below I am only querying one user to test the script) who have an Exchange plan set to disabled.
The filter I need to apply is on the licenses.servicestatus object. If you output this object, you get:
ServicePlan ProvisioningStatus
----------- ------------------
INTUNE_O365 PendingActivation
YAMMER_ENTERPRISE PendingInput
RMS_S_ENTERPRISE Success
OFFICESUBSCRIPTION Success
MCOSTANDARD Disabled
SHAREPOINTWAC Disabled
SHAREPOINTENTERPRISE Disabled
EXCHANGE_S_ENTERPRISE Success
What I need is the query to return true if it finds "disabled" in the provisioningstatus column and a matching "exchange" wildcard in the serviceplan column.
My script below does not do this, instead it returns true if it finds disabled and exchange in ANY order, IE it will always return true as long as disabled and Exchange are anywhere in the table, not where they both match on one row. This is as close as I can get as to what I want.
Get-MsolUser -UserPrincipalName "exampleuser@dom.com"| ? {"disabled" -in $_.licenses.servicestatus.provisioningstatus -and ($_.licenses.servicestatus| Out-String| ? {$_ -like "*exchange*"})}
I can see where I am going wrong, I just don't know how to fix it. The script is effectively running two separate searches rather than combining them together.
Also Note the reason I am using out-string is because the table above does not output serviceplan as a string.
If there is a better way of doing this then please advise otherwise I just need to know how to match two conditions in an array from the same row.
Get-MsolUser -UserPrincipalName "exampleuser@dom.com" |
ForEach-Object {
if( ($_.licenses.serviceplan.tostring() -match 'Exchange') -and ($_.licenses.ProvisioningStatus -eq 'Disabled') )
{
$true
}
Else
{
$false
}
}
Examining your code:
"disabled" -in $_.licenses.servicestatus.provisioningstatus
won't work, because
$_.licenses is an object with two properties, Servicestatus and Provisioningstatus,
so you can use either $_.licenses.servicestatus or $_.licenses.provisioningstatus, but not both chained together as $_.licenses.servicestatus.provisioningstatus, because there is no such property.
Also, -in checks whether a value is contained in an array, which is not suitable for what you are doing.
Your question got me thinking about Test-Any, which I read about in an article written by @JaredPar. The basic idea is to evaluate whether any item in an array of objects satisfies a set of conditions.
I have put it into a module like this.
function Test-Any {
[CmdletBinding()]
param([scriptblock]$EvaluateCondition,
[Parameter(ValueFromPipeline = $true)] $ObjectToTest)
begin {
$any = $false
}
process {
if (-not $any -and (& $EvaluateCondition $ObjectToTest)) {
$any = $true
}
}
end {
$any
}
}
function Test-All {
[CmdletBinding()]
param([scriptblock]$EvaluateCondition,
[Parameter(ValueFromPipeline = $true)] $ObjectToTest)
begin {
$all = $true
}
process {
if ($all -and ((& $EvaluateCondition $ObjectToTest) -eq $false)) {
$all = $false
}
}
end {
$all
}
}
Export-ModuleMember -Function Test-Any, Test-All
Now as you might have noticed there is also a Test-All function. This is not used for this sample but may come in handy.
Now you can solve your task like this.
Notice I have replaced the call to Get-MsolUser with some test data.
Import-Module AllAny
$testdata = @(
(new-object psobject -Property @{ServicePlan="ExchangePlan";licenses = new-object psobject -Property @{ProvisioningStatus="Disabled"}}),
(new-object psobject -Property @{ServicePlan="SomeOtherPlan";licenses = new-object psobject -Property @{ProvisioningStatus="Enabled"}}))
$userProp = $testdata #Get-MsolUser -UserPrincipalName "exampleuser@dom.com"
if ($userProp | Test-Any {$Args.serviceplan -match "Exchange" -and $Args.licenses.ProvisioningStatus -eq 'Disabled'})
{
echo "Do your thing!"
}
Hope that it makes sense.
I managed to fix this myself:
Get-MsolUser -UserPrincipalName "exampleuser@dom.com" | ? {$_.licenses.servicestatus| Out-String | ? {$_ -like "*exchange*disabled*"}}
This is fairly old, but here's a slightly neater solution I came up with, given the limitations of Azure queries.
First, create a list of the users that at least include the criteria you need to match on
$users = Get-Msoluser -EnabledFilter EnabledOnly |
Where { ($_.licenses.serviceplan.tostring() -match 'Exchange') `
-and ($_.licenses.ProvisioningStatus -eq 'Disabled') }
What you get is users that have both "Exchange" and "Disabled" somewhere within their Licenses attribute, but they may not be on the same row.
Just be cautious if you are looking for "unlicensed" users, because licenses can be reassigned. Here I'm using Get-AzureADUser and the AssignedPlans property instead. This user has been licensed for SfB twice, but one is still valid.
AssignedTimestamp CapabilityStatus Service ServicePlanId
----------------- ---------------- ------- -------------
2019-12-05 03:46:34 Enabled MicrosoftCommunicationsOnline 3e26ee1f-8a5f-4d52-aee2-b81ce45c8f40
2019-09-26 07:16:48 Deleted MicrosoftCommunicationsOnline 4828c8ec-dc2e-4779-b502-87ac9ce28ab7
After doing the first pass to populate the $users list, to get users that have at least one row with both an Exchange and a Disabled value, check each user's Licenses attribute with a Where statement on both properties. The following dumps the UPN into $licensedUsers for later export.
$licensedUsers = @()
$users | Foreach {
$u = $_
if ($u.licenses | where { ($_.serviceplan.tostring() -match 'Exchange') `
-and ($_.ProvisioningStatus -eq 'Disabled') }) {
$licensedUsers += $u.userPrincipalName
#if you want more properties in the report, create a PSCustomObject here instead
}
}
If you only wanted to get the users that don't have any valid Exchange licenses at all, you'd want to reverse the logic to find accounts where all the licences are not enabled.
if (-not ($u.licenses | where { ($_.serviceplan.tostring() -match 'Exchange') `
-and ($_.ProvisioningStatus -eq 'Enabled') }))
There are a couple of things to do this in one line (as far as I can tell):
Use nested Where-Objects to check each object down the tree
No need to convert ServicePlan to a string if you use the 'servicename' property underneath it
So I think this should meet the original posters' requirements in a single command:
Get-MsolUser -UserPrincipalName "exampleuser@dom.com" | Where-Object { $_.Licenses.ServiceStatus | Where-Object { $_.ServicePlan.ServiceName -like "*exchange*" -and $_.ProvisioningStatus -eq "Disabled" } }
Or for a shorter command:
Get-MsolUser -UserPrincipalName "exampleuser@dom.com" | ? { $_.Licenses.ServiceStatus | ? { $_.ServicePlan.ServiceName -like "*exchange*" -and $_.ProvisioningStatus -eq "Disabled" } }
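The difference between "both values appear somewhere" and "both values appear on the same row" can be reproduced without a tenant. The sketch below uses mock objects in which ServicePlan is a plain string. That is an assumption made for illustration only, since the real objects expose the name through a nested property.

```powershell
# Mock rows standing in for licenses.servicestatus (ServicePlan simplified to a string)
$serviceStatus = @(
    [pscustomobject]@{ ServicePlan = 'MCOSTANDARD';           ProvisioningStatus = 'Disabled' }
    [pscustomobject]@{ ServicePlan = 'EXCHANGE_S_ENTERPRISE'; ProvisioningStatus = 'Success' }
)

# Two independent searches: true as soon as each value occurs in any row
$anywhere = ('Disabled' -in $serviceStatus.ProvisioningStatus) -and
            [bool]($serviceStatus.ServicePlan -like '*exchange*')

# One filter per row: true only if a single row satisfies both conditions
$sameRow = [bool]($serviceStatus | Where-Object {
    $_.ServicePlan -like '*exchange*' -and $_.ProvisioningStatus -eq 'Disabled'
})

"anywhere: $anywhere, same row: $sameRow"   # anywhere: True, same row: False
```

The nested Where-Object commands above apply the same per-row filter to each user's ServiceStatus collection.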

Powershell -replace regex not working on connection strings

I'm attempting to use the PowerShell -replace operator to update the data source in my config file. However, the -replace regex below will not replace the $oldServer value.
I've placed a string directly into the $_.connectionString variable in the loop and it saved properly, so I know that is not the issue. It seems to be just the regex.
#environment variables
$env = "DEV"
$oldServer = "quasq10"
$newValue = "$env-AR-SQL.CORP.COM"
$doc = [xml](Get-Content "D:\AMS\app.config")
$doc.configuration.connectionStrings.add|%{
$_.connectionString = $_.connectionString -replace $oldServer, $newValue;
}
$doc.Save($file.FullName)
EDIT
Per the comment below I added a Write-host $_.connectionString statement as the first line in the loop. Below is the console output
metadata=res:///MonetDb.csdl|res:///MonetDb.ssdl|res://*/MonetDb.msl;provider=System.Data.SqlClient;provider connection string="data source=quasq10\sql08a;initial catalog=MyDB
;integrated security=True;multipleactiveresultsets=True;App=EntityFramework"
I just put this right into ISE, I copied your connection string into a variable and was able to do this replace as a one off.
$connectionString = 'metadata=res:///MonetDb.csdl|res:///MonetDb.ssdl|res://*/MonetDb.msl;provider=System.Data.SqlClient;provider connection string="data source=quasq10\sql08a;initial catalog=MyDB ;integrated security=True;multipleactiveresultsets=True;App=EntityFramework"'
$env = "DEV"
$oldServer = "quasq10"
$newValue = "$env-AR-SQL.CORP.COM"
$connectionString -replace $oldServer, $newValue
res:///MonetDb.csdl|res:///MonetDb.ssdl|res://*/MonetDb.msl;provider=System.Data.SqlClient;provider connection string="data source=DEV-AR-SQL.CORP.COM\sql08a;initial catalog=MyDB ;integrated security=True;multipleactiveresultsets=True;App=EntityFramework"
I think your foreach loop might not be getting the info you want, because it looks like your replace is fine.
$doc.configuration.connectionStrings.add
I haven't done much with XML, does the XML data type have an ADD member function? You aren't really adding anything, right?
As a test, what do you get from this:
$doc.configuration.connectionStrings | % {
$_.connectionString -replace $oldServer, $newValue;
}
Run that against a dummy file and see what happens.
For a sanity check on the replace operator:
$string = "The quick brown fox jumped over the lazy dog"
$oldColor = "brown"
$newColor = "orange"
$string -replace $oldColor, $newColor
To avoid digging through comments, this method worked
$string.Replace($oldColor,$newColor)
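One general difference between the two that is worth knowing, whether or not it was the cause here: -replace treats its first operand as a regular expression, while the .Replace() string method matches literally (and case-sensitively). The sketch below shows a case where that matters. The search value including the instance name is an assumption for illustration, not what the question used.

```powershell
$string = 'data source=quasq10\sql08a;initial catalog=MyDB'
$old    = 'quasq10\sql08a'              # contains \s, which a regex reads as "whitespace"
$new    = 'DEV-AR-SQL.CORP.COM\sql08a'

# Unescaped: the pattern looks for 'quasq10', a whitespace character, then 'ql08a',
# so nothing matches and the string comes back unchanged
$unescaped = $string -replace $old, $new

# Escaped: [regex]::Escape turns the value into a literal pattern
$viaEscape = $string -replace [regex]::Escape($old), $new

# The string method never interprets the search value as a regex
$viaMethod = $string.Replace($old, $new)
```

Here $viaEscape and $viaMethod both contain the new server name, while $unescaped equals the original string.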

Perl taint mode with domain name input for CGI resulting in “Insecure dependency in eval”

Given the following in a CGI script with Perl and taint mode I have not been able to get past the following.
tail /etc/httpd/logs/error_log
/usr/local/share/perl5/Net/DNS/Dig.pm line 906 (#1)
(F) You tried to do something that the tainting mechanism didn't like.
The tainting mechanism is turned on when you're running setuid or
setgid, or when you specify -T to turn it on explicitly. The
tainting mechanism labels all data that's derived directly or indirectly
from the user, who is considered to be unworthy of your trust. If any
such data is used in a "dangerous" operation, you get this error. See
perlsec for more information.
[Mon Jan 6 16:24:21 2014] dig.cgi: Insecure dependency in eval while running with -T switch at /usr/local/share/perl5/Net/DNS/Dig.pm line 906.
Code:
#!/usr/bin/perl -wT
use warnings;
use strict;
use IO::Socket::INET;
use Net::DNS::Dig;
use CGI;
$ENV{"PATH"} = ""; # Latest attempted fix
my $q = CGI->new;
my $domain = $q->param('domain');
if ( $domain =~ /(^\w+)\.(\w+\.?\w+\.?\w+)$/ ) {
$domain = "$1\.$2";
}
else {
warn("TAINTED DATA SENT BY $ENV{'REMOTE_ADDR'}: $domain: $!");
$domain = ""; # successful match did not occur
}
my $dig = new Net::DNS::Dig(
Timeout => 15, # default
Class => 'IN', # default
PeerAddr => $domain,
PeerPort => 53, # default
Proto => 'UDP', # default
Recursion => 1, # default
);
my @result = $dig->for( $domain, 'NS' )->to_text->rdata();
@result = sort @result;
print @result;
I normally use Data::Validate::Domain to do checking for a “valid” domain name, but could not deploy it in a way in which the tainted variable error would not occur.
I read that in order to untaint a variable you have to pass it through a regex with capture groups and then join the capture groups to sanitize it. So I deployed $domain =~ /(^\w+)\.(\w+\.?\w+\.?\w+)$/. As shown here it is not the best regex for the purpose of untainting a domain name and covering all possible domains but it meets my needs. Unfortunately my script is still producing tainted failures and I can not figure out how.
Regexp-Common does not provide a domain regex and modules don’t seem to work with untainting variable so I am at a loss now.
How to get this thing to pass taint checking?
$domain is not tainted
I verified that your $domain is not tainted. This is the only variable you use that could be tainted, in my opinion.
perl -T <(cat <<'EOF'
use Scalar::Util qw(tainted);
sub p_t($) {
if (tainted $_[0]) {
print "Tainted\n";
} else {
print "Not tainted\n";
}
}
my $domain = shift;
p_t($domain);
if ($domain =~ /(^\w+)\.(\w+\.?\w+\.?\w+)$/) {
$domain = "$1\.$2";
} else {
warn("$domain\n");
$domain = "";
}
p_t($domain);
EOF
) abc.def
It prints
Tainted
Not tainted
What Net::DNS::Dig does
See Net::DNS::Dig line 906. It is the beginning of to_text method.
sub to_text {
my $self = shift;
my $d = Data::Dumper->new([$self],['tobj']);
$d->Purity(1)->Deepcopy(1)->Indent(1);
my $tobj;
eval $d->Dump; # line 906
…
From the definition of new, I know that $self is just a hashref containing the values of the parameters passed to new plus several others filled in by the constructor. The evaled code produced by $d->Dump sets $tobj to a deep copy of $self (Deepcopy(1)), with correctly set self-references (Purity(1)) and basic pretty-printing (Indent(1)).
Where is the problem, how to debug
From what I found out about &Net::DNS::Dig::to_text, it is clear that the problem is at least one tainted item inside $self. So you have a straightforward way to debug your problem further: after constructing the $dig object in your script, check which of its items is tainted. You can dump the whole structure to stdout using print Data::Dumper::Dump($dig);, which is roughly the same as the evaled code, and check suspicious items using &Scalar::Util::tainted.
I have no idea how far this is from making Net::DNS::Dig work in taint mode. I do not use it, I was just curious and wanted to find out, where the problem is. As you managed to solve your problem otherwise, I leave it at this stage, allowing others to continue debugging the issue.
As a resolution to this question, in case anyone comes across it in the future: it was indeed the module I was using that caused the taint checks to fail, which taught me an important lesson about trusting modules in a CGI environment. I switched to Net::DNS, as I figured it would not encounter this issue, and sure enough it does not. My code is provided below for reference in case anyone wants to accomplish the same thing I set out to do, which is to locate the nameservers defined for a domain within its own zone file.
#!/usr/bin/perl -wT
use warnings;
use strict;
use IO::Socket::INET;
use Net::DNS;
use CGI;
$ENV{"PATH"} = ""; # Latest attempted fix
my $q = CGI->new;
my $domain = $q->param('domain');
my @result;
if ( $domain =~ /(^\w+)\.(\w+\.?\w+\.?\w+)$/ ) {
$domain = "$1\.$2";
}
else {
warn("TAINTED DATA SENT BY $ENV{'REMOTE_ADDR'}: $domain: $!");
$domain = ""; # successful match did not occur
}
my $ip = inet_ntoa(inet_aton($domain));
my $res = Net::DNS::Resolver->new(
nameservers => [($ip)],
);
my $query = $res->query($domain, "NS");
if ($query) {
foreach my $rr (grep { $_->type eq 'NS' } $query->answer) {
push(@result, $rr->nsdname);
}
}
else {
warn "query failed: ", $res->errorstring, "\n";
}
@result = sort @result;
print @result;
Thanks for the comments assisting me in this matter, and SO for teaching me more than any other resource I have come across.