Powershell regex replacement expressions

I've followed the excellent solution in this article:
PowerShell multiple string replacement efficiency
to try and normalize telephone numbers imported from Active Directory. Here is an example:
$telephoneNumbers = @(
    '+61 2 90237534',
    '04 2356 3713'
    '(02) 4275 7954'
    '61 (0) 3 9635 7899'
    '+65 6535 1943'
)
# Build hashtable of search and replace values.
$replacements = @{
    ' ' = ''
    '(0)' = ''
    '+61' = '0'
    '(02)' = '02'
    '+65' = '001165'
    '61 (0)' = '0'
}
# Join all (escaped) keys from the hashtable into one regular expression.
[regex]$r = @($replacements.Keys | foreach { [regex]::Escape( $_ ) }) -join '|'
[scriptblock]$matchEval = { param( [Text.RegularExpressions.Match]$matchInfo )
    # Return replacement value for each matched value.
    $matchedValue = $matchInfo.Groups[0].Value
    $replacements[$matchedValue]
}
# Perform replace over every line in the file and append to log.
$telephoneNumbers |
    foreach {$r.Replace($_,$matchEval)}
I'm having problems with the formatting of the match expressions in the $replacements hashtable. For example, I would like to match all +61 numbers and replace with 0, and match all other + numbers and replace with 0011.
I've tried the following regular expressions but they don't seem to match:
'^+61'
'^+[^61]'
What am I doing wrong? I've tried using \ as an escape character.

I've rearranged this a bit. I'm not sure it covers your whole situation, but it gives the right results for the example.
I think the key is not to build one big regex from the hashtable, but rather to loop over the hashtable and check each key against the telephone numbers.
The only other change I made was moving the ' ', '' replacement out of the hashtable and into the code that prints the replacement phone number, since you want it to run in every scenario.
Code is below:
$telephoneNumbers = @(
    '+61 2 90237534',
    '04 2356 3713'
    '(02) 4275 7954'
    '61 (0) 3 9635 7899'
    '+65 6535 1943'
)
$replacements = @{
    '(0)' = ''
    '+61' = '0'
    '(02)' = '02'
    '+65' = '001165'
}
foreach ($t in $telephoneNumbers) {
    $m = $false
    foreach($r in $replacements.getEnumerator()) {
        if ( $t -match [regex]::Escape($r.key) ) {
            $m = $true
            $t -replace [regex]::Escape($r.key), $r.value -replace ' ', '' | write-output
        }
    }
    if (!$m) { $t -replace ' ', '' | write-output }
}
Gives:
0290237534
0423563713
0242757954
61396357899
00116565351943
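As a side note on the two patterns in the question: an unescaped + is a regex quantifier rather than a literal plus sign, and [^61] is a character class that matches any single character other than 6 or 1, not "anything except 61". Patterns like these also can't be passed through [regex]::Escape, which would turn them back into literal text. Below is a minimal sketch of one way to express both rules, assuming an ordered rule list and a hypothetical helper named Convert-PhoneNumber (neither is from the original post):

```powershell
# Ordered rules: the first pattern that matches wins (illustrative rule set only).
$rules = @(
    @{ Pattern = '^\+61';     Replacement = '0'    }   # literal +61 at the start
    @{ Pattern = '^\+(?!61)'; Replacement = '0011' }   # any other leading +
)

# Hypothetical helper: strips whitespace, then applies the first matching rule.
function Convert-PhoneNumber {
    param([string]$Number)
    $result = $Number -replace '\s', ''
    foreach ($rule in $rules) {
        if ($result -match $rule.Pattern) {
            $result = $result -replace $rule.Pattern, $rule.Replacement
            break
        }
    }
    $result
}

Convert-PhoneNumber '+61 2 90237534'
Convert-PhoneNumber '+65 6535 1943'
```

For the two sample numbers this should give 0290237534 and 00116565351943, matching the output above. The (?!61) part is a negative lookahead, so the second rule only fires when the digits after the plus sign are not 61.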

Related

I want to split a string from : to \n in Powershell script

I am using a config file that contains some information as shown below.
User1:xyz@gmail.com
User1_Role:Admin
NAME:sdfdsfu4343-234324-ffsdf-34324d-dsfhdjhfd943
ID:xyz@abc-demo-test-abc-mssql
Password:rewrfsdv34354*fds*vdfg435434
I want to split each value from : to newline in my PowerShell script.
I am using -split '[: *\n]'. It works perfectly as long as there is no '*' in the value; if there is a '*', it only fetches the text up to it. For example, for Password it matches only rewrfsdv34354. Here is my code:
$i = 0
foreach ($keyOrValue in $Contents -split '[: *\n]') {
    if ($i++ % 2 -eq 0) {
        $varName = $keyOrValue
    }
    else {
        Set-Variable $varName $keyOrValue
    }
}
I need to match all the chars after : to \n. Please share your ideas.

It's probably best to perform two separate splits here. That makes it easier to work out where the code is going wrong, although the $i % 2 -eq 0 part is a neat way to pick up key/value.
I would go for this:
# Split the Contents variable by newline first (handles both LF and CRLF)
foreach ($line in $Contents -split '\r?\n') {
    # Now split each line at the first colon only, so a value may itself contain ':'
    $keyOrValue = $line -split ':', 2
    # Then set the variables based on the parts of the colon-split
    Set-Variable $keyOrValue[0] $keyOrValue[1]
}
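To see why the original pattern cuts the password short: inside square brackets every character is its own delimiter, so '[: *\n]' splits on a colon, a space, an asterisk or a newline. A small sketch using the Password line from the question (the variable names are just for illustration):

```powershell
$line = 'Password:rewrfsdv34354*fds*vdfg435434'

# Character class: ':' and '*' are both treated as delimiters.
$broken = $line -split '[: *\n]'

# Split on the first colon only; the value keeps its asterisks.
$fixed = $line -split ':', 2

$broken.Count   # 4 pieces: Password, rewrfsdv34354, fds, vdfg435434
$fixed[1]       # rewrfsdv34354*fds*vdfg435434
```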

You could also convert to a hashtable and go from there, e.g.:
$h = @{}
gc config.txt | % { $key, $value = $_ -split ' *: *'; $h[$key] = $value }
Or with ConvertFrom-StringData:
$h = (gc -raw config.txt) -replace ':','=' | ConvertFrom-StringData
Now you have convenient access to keys and values, e.g.:
$h
Output:
Name                           Value
----                           -----
Password                       rewrfsdv34354*fds*vdfg435434
User1                          xyz@gmail.com
ID                             xyz@abc-demo-test-abc-mssql
NAME                           sdfdsfu4343-234324-ffsdf-34324d-dsfhdjhfd943
User1_Role                     Admin
Or only keys:
$h.keys
Output:
Password
User1
ID
NAME
User1_Role
Or only values:
$h.values
Output:
rewrfsdv34354*fds*vdfg435434
xyz@gmail.com
xyz@abc-demo-test-abc-mssql
sdfdsfu4343-234324-ffsdf-34324d-dsfhdjhfd943
Admin
Or specific values:
$h['user1'] + ", " + $h['user1_role']
Output:
xyz@gmail.com, Admin
etc.

Getting first two strings between slashes

I have a string, alpha/beta/charlie/delta
I'm trying to extract out the string alpha/beta including the forward slash.
I'm able to accomplish this with split and joining the first and second result, but I feel like a regex might be better suited.
The number of slashes also determines how many strings I need to grab, e.g. if there are 4 slashes, get the first two strings; if there are 5, grab the first three. Again, my problem is extracting the slash along with the string.

As Mathias already noticed - Split+Join is a perfectly valid solution:
$StringArray = @(
    'alpha/beta/charlie/delta',
    'alpha/beta/charlie/delta/omega'
    'alpha/beta/charlie/gamma/delta/omega'
)
foreach ($String in $StringArray) {
    $StringSplit = $String -split '/'
    ($StringSplit | Select-Object -First ($StringSplit.Count - 2) ) -join '/'
}
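Since the question asks about a regex: here is a minimal sketch, assuming (as in the loop above) that the last two segments are always the ones to drop. The pattern is my own illustration, not part of the original answer.

```powershell
$StringArray = @(
    'alpha/beta/charlie/delta'
    'alpha/beta/charlie/delta/omega'
)

# '(/[^/]*){2}$' matches the last two '/segment' pairs at the end of the string.
# -replace with no replacement text simply removes them.
$Trimmed = foreach ($String in $StringArray) {
    $String -replace '(/[^/]*){2}$'
}
$Trimmed
```

For these two inputs this should print alpha/beta and alpha/beta/charlie, the same as the Split+Join version.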

A little long, but I did it without regex:
$string = 'alpha/beta/charlie/delta/gamma'
# Count number of '/'
$count = 0
for( $i = 0; $i -lt $string.Length; $i++ ) {
    if( $string[ $i ] -eq '/' ) {
        $count = $count + 1
    }
}
# Depending on the number of '/' you can create a mathematical equation, or simply do an if-else ladder.
# In this case, if count of '/' = 3, get first 2 strings, if count = 4, get first 3 strings.
function parse-strings {
    Param (
        $number_of_slashes,
        $string
    )
    $all_slash = $number_of_slashes
    $to_get = $number_of_slashes - 1
    $counter = 0
    for( $j = 0; $j -lt $string.Length; $j++ ) {
        if( $string[ $j ] -eq '/' ) {
            $counter = $counter + 1
        }
        if( $counter -eq $to_get ) {
            ( $string[ 0 .. ( $j - 1 ) ] -join "" )
            break
        }
    }
}
parse-strings -number_of_slashes $count -string $string

You can try the .NET .Split() method, where the character to split on goes in the parentheses.
Then join the elements you need from the resulting array, either with the -join operator or with plain string concatenation.
For your case, use it like this:
$string = 'alpha/beta/charlie/delta/gamma'
$string = $string.split('/')
$string = "$($string[0])" + "/" + "$($string[1])"
$string
And so on...

Replace spaces with tabs

I have some text content that I would like to split into a friendlier view and later export to CSV format. I want to replace the first couple of spaces with tabs. I tried the regex pattern \s, but it split all of the text.
You may see sample data and my results

This should do the trick:
$sourceFilePath = 'c:\infile.txt'
$destFilePath = 'c:\outfile.txt'
$writeHandle = [System.IO.File]::OpenWrite( $destFilePath )
foreach($line in [System.IO.File]::ReadLines($sourceFilePath))
{
    $outbuf = [byte[]][char[]](($line -replace '^(.*?) (.*?) (.*?) (.*?) (.*)$', '$1*$2*$3*$4*$5').Replace("*", "`t") + [environment]::NewLine)
    [void]$writeHandle.Write( $outbuf, 0, $outbuf.Length )
}
[void]$writeHandle.Close()

You can use something like this
$inputtext = Get-Content 'EQ-Input.txt'
$outputobject = foreach ($Line in $inputtext) {
    $arr = $line -split ' '
    [pscustomobject]@{
        Date = $arr[0]
        Time = $arr[1]
        Code = $arr[2]
        Result = $arr[3..($arr.Length-1)] -join ' '
    }
}
Then you can use $outputobject for further analysis or you can convert it or save it as CSV.
$outputobject | ConvertTo-Csv
$outputobject | Export-Csv -Path 'EQ-Output.csv'
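If only the first few spaces should become tabs, the -split operator also accepts a maximum number of parts, which leaves the rest of the line untouched. This is a rough sketch with a made-up sample line (the original sample data is not shown here), assuming four columns as in the object above:

```powershell
# Made-up sample line; the real input format is an assumption.
$line = '2019-01-01 12:00:00 E100 Some result text'

# Split on single spaces into at most 4 parts, so the last part keeps its spaces.
$parts = $line -split ' ', 4

# Rejoin with tabs: only the first three spaces are replaced.
$tabbed = $parts -join "`t"
$tabbed
```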

Matching Something Against Array List Using Where Object

I've found multiple examples of what I'm trying here, but for some reason it's not working.
I have a list of regular expressions that I'm checking against a single value and I can't seem to get a match.
I'm attempting to match domains. e.g. gmail.com, yahoo.com, live.com, etc.
I am importing a csv to get the domains and have debugged this code to make sure the values are what I expect. e.g. "gmail.com"
Regular expression examples AKA $FinalWhiteListArray
(?i)gmail\.com
(?i)yahoo\.com
(?i)live\.com
Code
Function CheckDirectoryForCSVFilesToSearch {
    $global:CSVFiles = Get-ChildItem $Global:Directory -recurse -Include *.csv | % {$_.FullName} #removed -recurse
}
Function ImportCSVReports {
    Foreach ($CurrentChangeReport in $global:CSVFiles) {
        $global:ImportedChangeReport = Import-csv $CurrentChangeReport
    }
}
Function CreateWhiteListArrayNOREGEX {
    $Global:FinalWhiteListArray = New-Object System.Collections.ArrayList
    $WhiteListPath = $Global:ScriptRootDir + "\" + "WhiteList.txt"
    $Global:FinalWhiteListArray= Get-Content $WhiteListPath
}
$Global:ScriptRootDir = Split-Path -Path $psISE.CurrentFile.FullPath
$Global:Directory = $Global:ScriptRootDir + "\" + "Reports to Search" + "\" #Where to search for CSV files
CheckDirectoryForCSVFilesToSearch
ImportCSVReports
CreateWhiteListArrayNOREGEX
Foreach ($Global:Change in $global:ImportedChangeReport){
    If (-not ([string]::IsNullOrEmpty($Global:Change.Previous_Provider_Contact_Email))){
        $pos = $Global:Change.Provider_Contact_Email.IndexOf("@")
        $leftPart = $Global:Change.Provider_Contact_Email.Substring(0, $pos)
        $Global:Domain = $Global:Change.Provider_Contact_Email.Substring($pos+1)
        $results = $Global:FinalWhiteListArray | Where-Object { $_ -match $global:Domain}
    }
}
Thanks in advance for any help with this.

the problem with your current code is that you put the regex on the left side of the -match operator. [grin] swap that and your code otta work.
taking into account what LotPings pointed out about case sensitivity and using a regex OR symbol to make one test per URL, here's a demo of some of that. the \b is for word boundaries, the | is the regex OR symbol. the $RegexURL_WhiteList section builds that regex pattern from the 1st array. if i haven't made something clear, please ask ...
$URL_WhiteList = @(
    'gmail.com'
    'yahoo.com'
    'live.com'
)
$RegexURL_WhiteList = -join @('\b', (@($URL_WhiteList |
    ForEach-Object {
        [regex]::Escape($_)
    }) -join '|\b'))
$NeedFiltering = @(
    'example.com/this/that'
    'GMail.com'
    'gmailstuff.org/NothingElse'
    'NotReallyYahoo.com'
    'www.yahoo.com'
    'SomewhereFarAway.net/maybe/not/yet'
    'live.net'
    'Live.com/other/another'
)
foreach ($NF_Item in $NeedFiltering)
{
    if ($NF_Item -match $RegexURL_WhiteList)
    {
        '[ {0} ] matched one of the test URLs.' -f $NF_Item
    }
}
output ...
[ GMail.com ] matched one of the test URLs.
[ www.yahoo.com ] matched one of the test URLs.
[ Live.com/other/another ] matched one of the test URLs.
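To make the operand-order point concrete, here is a small sketch using the three whitelist patterns from the question and a single sample domain. The variable names are illustrative, not taken from the original script.

```powershell
$whiteList = @('(?i)gmail\.com', '(?i)yahoo\.com', '(?i)live\.com')
$domain = 'GMail.com'

# Pattern on the left: the domain is (wrongly) used as the regex and the
# whitelist entry as the text to search, so nothing matches.
$wrong = @($whiteList | Where-Object { $_ -match $domain })

# Text on the left, pattern on the right: the whitelist entry is the regex.
$right = @($whiteList | Where-Object { $domain -match $_ })

$wrong.Count   # 0
$right.Count   # 1
```

Applied to the loop in the question, that would mean testing $global:Domain -match $_ inside the Where-Object block.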

Powershell to use regx to find character position

I want to add " after the third comma and " before the fifth comma. How can this be done in PowerShell?
My idea is to use a regex function to find the positions of the third and fifth commas and then insert " at them with
$s.Insert(4,'-') (in case the regex returns position 4)
example data
04642583,3,HC Mobile,O213,Inc,SIS Services,KR,Non Payroll Relevant,KR50
Output
04642583,3,HC Mobile,"O213,Inc",SIS Services,KR,Non Payroll Relevant,KR50
This is the code I tried, but it failed with 'An empty pipe element is not allowed'. How can I fix it?
$source = "D:\Output\MoreComma.csv"
$FinalFile = "D:\Output\MoreComma_Corrected.csv"
$content = Get-Content $source
foreach ($line in $content)
{
    $items = $line.split(',');
    $items[3] = '"'+$items[3]
    $items[4] = $items[4]+'"';
    $items -join ','
} | Set-Content $FinalFile

If you know the format (e.g. you know that it's always comma-separated in this fashion) and you're only trying to achieve this, you can simply split the line, add the quotes and join the line again.
Example:
$data = "04642583,3,HC Mobile,O213,Inc,SIS Services,KR,Non Payroll Relevant,KR50";
$items = $data.split(',');
$items[3] = '"'+$items[3]
$items[4] = $items[4]+'"';
$items -join ','
This will produce the line:
04642583,3,HC Mobile,"O213,Inc",SIS Services,KR,Non Payroll Relevant,KR50
Given you've stored this in a CSV file:
$file = "C:\tmp\test.csv";
$lines = (get-content $file);
$newLines=($lines|foreach-object {
    $items = $_.split(',');
    $items[3] = '"'+$items[3]
    $items[4] = $items[4]+'"';
    $items -join ','
})
You can then output the result in a new file if you want
$newLines|Set-content C:\tmp\test2.csv
This will "mess up" your CSV file's format though (as the two columns will be treated as merged into one), but I'm guessing this is what you're trying to achieve?
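On the 'An empty pipe element is not allowed' error itself: the foreach statement is a language construct whose output cannot be piped directly, whereas the ForEach-Object cmdlet can sit in a pipeline. Since the question mentions regex, here is a sketch that combines the two. It works on an in-memory array rather than the real file, and the pattern is my own assumption about the layout (exactly the fourth and fifth fields get quoted):

```powershell
$lines = @(
    '04642583,3,HC Mobile,O213,Inc,SIS Services,KR,Non Payroll Relevant,KR50'
)

# ^((?:[^,]*,){3}) captures the first three fields together with their commas;
# ([^,]*,[^,]*)    captures the fourth and fifth fields, which get wrapped in quotes.
$fixed = $lines | ForEach-Object {
    $_ -replace '^((?:[^,]*,){3})([^,]*,[^,]*)', '$1"$2"'
}
$fixed
```

Because ForEach-Object is a cmdlet, the same pipeline could end in | Set-Content with a destination path instead of being assigned to a variable.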
