PowerShell -split on Pipe Character - regex

Consider the ASCII text file test1.txt:
a,b,c
d,e,f
And the following Powershell Script test1.ps1:
$input -split "`n" | ForEach-Object {
$row = $_ -split ","
$row[0]
}
The output is, as excpected:
a
d
However, if we change the separator to | everything fails as in test2.txt:
a|b|c
d|e|f
And the following Powershell Script test2.ps1:
$input -split "`n" | ForEach-Object {
$row = $_ -split "|"
$row[0]
}
The output is all but empty. Why does the -split fail?

It seems -split expects a regular expression and thus you need to escape the pipe as in:
$row = $_ -split "\|"
Or specify the SimpleMatch option to split on the literal string or character:
$row = $_ -split "|", 0, "SimpleMatch"
The 0 stands for MaxSubstrings: "The maximum number of substrings, by default all (0)."
Source: http://ss64.com/ps/split.html
Also: Get-Help about_Split

Related

Powershell: Replace only fist occurence of a line/string in entire file

I have following beggining of a Powershell script in which I would like to replace the values of variables for different enviroment.
$SomeVar1 = "C:\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewriten
$SomeVar2 = "C:\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewriten
When I run the rewrite script, the result should look like this:
$SomeVar1 = "F:\different\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewrite
$SomeVar2 = "F:\different\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewriten
Current script that does(n't) rewrite:
$arr = #(
[PSCustomObject]#{Regex = '$SomeVar1 = "'; Replace = '$SomeVar1 = "F:\different\path\to\file\a"'}
[PSCustomObject]#{Regex = '$SomeVar2 = "'; Replace = '$SomeVar1 = "F:\different\path\to\file\b"'}
)
for ($i = 0; $i -lt $arr.Length; $i++) {
$ArrRegex = [Regex]::Escape($arr[$i].Regex)
$ArrReplace = $arr[$i].Replace
# Get full line for replacement
$Line = Get-Content $Workfile | Select-String $ArrRegex | Select-Object -First 1 -ExpandProperty Line
# Rewrite part
$Line = [Regex]::Escape($Line)
$Content = Get-Content $Workfile
$Content -replace "^$Line",$ArrReplace | Set-Content $Workfile
}
This replaces all the occurences in file on the start of the line (and I need only the 1st one) and doest not replace the one in Note which is okay.
Then I found this Powershell: Replace last occurence of a line in a file which does the exact oposite of what I need, only rewrites the last occurence of the string and it does it in the Note aswell and I would somehow like to change it to do the opposite - 1st occurence, line begining (Wont target the Note)
Code in my case looks like this:
# Rewrite part
$Line = [Regex]::Escape($Line)
$Content = Get-Content $Workfile -Raw
$Line = "(?s)(.*)$Line"
$ArrReplace = "`$1$ArrReplace"
$Content -replace $Line,$ArrReplace | Set-Content $Workfile
Do you have any recommendations on how to archive my goal, or is there a more sothisticated way to replace variables for powershell scripts like this?
Thanks in advance.
So I finally figured it out, I had to add Select-String "^$ArrRegex" during $Line creation to exclude any string that were on on line beggining and then use this Regex to do the job: ^(?s)(.*?\n)$Line
In my case it does the following: Only selects 1st occurnece on the beggining of the line and replaces it. It ignores everything else and when re-run, does not rewrite others. The copies of vars will not really exist in final version and will be set once like $Var1 = "Value" and never changed during script, but I wanted to be sure that I won't replace something by mistake.
The final replacing part does look like this:
for ($i = 0; $i -lt $arr.Length; $i++) {
$ArrRegex = [Regex]::Escape($arr[$i].Regex)
$ArrReplace = $arr[$i].Replace
$Line = Get-Content $Workfile | Select-String "^$ArrRegex" | Select-Object -First 1 -ExpandProperty Line
$Line = [Regex]::Escape($Line)
$Line = "^(?s)(.*?\n)$Line"
$ArrReplace = "`$1$ArrReplace"
$Content -replace $Line, $ArrReplace | Set-Content $Workfile
}
You could possibly use flag variables like below to only do the first replacement for each of your regex patterns.
$Altered = Get-Content -Path $Workfile |
Foreach-Object {
if(-not $a) { #If replacement hasn't been done, replace
$_ = $_ -replace 'YOUR_REGEX1','YOUR_REPLACEMENT1'
if($_ -match 'YOUR_REPLACEMENT1') { $a = 'replacement done' } #Set Flag
}
if(-not $b) { #If replacement hasn't been done, replace
$_ = $_ -replace 'YOUR_REGEX2','YOUR_REPLACEMENT2'
if($_ -match 'YOUR_REPLACEMENT2') { $b = 'replacement done' } #Set Flag
}
$_ # Pipe back to $Altered
}
$Altered | Set-Content -Path $WorkFile
Just reverse the RegEx, if that is what you are after:
Clear-Host
#'
abc
abc
abc
'# -replace '^(.*?)\babc\b', '$1HelloWorld'
# Results
<#
HelloWorld
abc
abc
#>

PowerShell: Pattern Matching Text File Contents to Insert into .CSV

I have been struggling to successfully break apart contents of a text file and insert them into a .csv with the following rules:
The line containing '>' should be inserted into .csv column 1
The lines containing all caps should be inserted into .csv column 2 and each block of capital letters should be joined (have its `r or `n removed)
'>' and '*' should be removed where present
Separately, I can get column 1 to work fairly well using:
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
if ($line -match '^>') {
[pscustomobject]#{
'Part1' = (Select-String '^>' -InputObject $line) -replace '>', ''
}
}
}
$data | Out-File 'newfile.csv'
and limited success using similar for column 2 (I can't seem to get -join to work with `r or `n):
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
if ($line -match '^[A-Z].*') {
[pscustomobject]#{
'Part2' = (Select-String '^[A-Z].*' -InputObject $line) -replace '*', ''
}
}
}
$data | Out-File 'newfile.csv'
But it escapes me how to get both to work in the same code block to iterate over each section delimited by '>' and/or '*'.
Below is a sample of the data for reference.
>9392290|2983921
FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIA
GDFOUYUIOAGHIHUAGSD
>lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1
FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAW
DFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNUPAPNUNPUFSAPNUSS
FSADUHHULGWAUNUNWEANNIOEAWNUNIIIINNBSDNJLKNJKLAERGJKLHHJLKGS
DFSAQSAHUSDFAHOUHGROUGRWE*
>jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31
ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPI
GWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIOAGPIOJSGNJHIOWEA
AUHNHIOEANPIASPNIOICBNIOASGIOEGWPIOWEPPPPSAJPOJKGPWEAIOJJPIO
FAWEIOPHGAHNIOPGWEOPPOEAWSPIOOPUIGSUIOGUIOPWAGIEOUIWEAOGUIOP
GEIOJHIOJPWEPJIOWGEIOPHGANIONIOGEWANIOEGWOPIHNNPIOEGWIJOWEAG
GEPUIEWUIOSZBHJENWNBENUEBMIPEWVMIEMUIAZWIPNBWEPEWIOJJKEAWPIA
GWEPHIOEWNPOEWANNNPIOGWREIJUOGUHIOSNJJJJJJJJKVMVIOIPEGIOEAUW
EGWIOJNENIOPIOWINPEAWNPOI*
I suggest using a -split operation:
(Get-Content -Raw samplefile.txt) -split '(?m)^>(.+)' -ne '' |
ForEach-Object -Begin { $i = 0 } -Process {
if (++$i % 2) { # 1st, 3rd, ... result, i.e. the ">"-prefixed lines
$part1 = $_ # Save for later.
} else { # 2nd, 4th, ... result, i.e. the all-uppercase lines
[pscustomobject] #{ # Construct and output a custom object.
Part1 = $part1
Part2 = $_ -replace '\r?\n|\*$' # Remove newlines and trailing "*"
}
}
} # pipe to Export-Csv as needed.
To-display output:
Part1 Part2
----- -----
9392290|2983921 FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIAGDFOUYUIOAGHIHUAGSD
lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1 FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAWDFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNU…
jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31 ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPIGWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIO…

PowerShell Regex with csv file

I'm currently trying to match a pattern of IDs and replace with 0 or 1.
example pc0045601234 replace with 1234 the last 4 and add the 3rd digit in front "01234"
I tried the code below but the out only filled the userid column with No matching employee
$reportPath = '.\report.csv'`$reportPath = '.\report.csv'`
$csvPath = '.\output.csv'
$data = Import-Csv -Path $reportPath
$output = #()
foreach ($row in $data) {
$table = "" | Select ID,FirstName,LastName,userid
$table.ID = $row.ID
$table.FirstName = $row.FirstName
$table.LastName = $row.LastName
switch -Wildcard ($row.ID)
{
{$row.ID -match 'P\d\d\d\d\d\D\D\D'} {$table.userid = "Contractor"; continue}
{$row.ID -match 'SEC\d\d\d\D\D\D\D'} {$table.userid = "Contractor"; continue}
{$row.ID.StartsWith("P005700477")} {$table.userid = $row.ID -replace "P005700477","0477"; continue}
{$row.ID.StartsWith("P00570")} {$table.userid = $row.ID -replace "P00570","0"; continue}
default {$table.userid = "No Matching Employee"}
}
$output += $table
}
$output | Export-csv -NoTypeInformation -Path $csvPath
Here are three different ways to achieve the desired result. The first two use the same technique, just written in a different way.
First we put the sample data in a variable as a multiline string array. This is the equivalent as $text = Get-Content $somefile
$text = #'
PC05601234
PC15601234
'# -split [environment]::NewLine
Option 1 # convert to character array, select the 3rd and last 4 digits.
$text | foreach {-join ($_.ToChararray()| select -Skip 2 -First 1 -Last 4)}
Option 2 # same as above, requiring an extra -join to avoid spaces.
$text | foreach {(-join $_.ToChararray()| foreach{$_[2]+(-join $_[-4..-1])})}
Option 3 # my preference, regex. Capture the desired digits and replace the entire string with those two captured values.
$text -replace '^\D+(?!=\d)(\d)\w+([\d]{4}$)','$1$2'
All of these output
01234
11234
Further testing with different char/digit combinations and lengths.
$text = #'
PC05601234
PC15601234
PC0ABC124321
PC1DE4321
PC0A5678
PC1ABCD215678
'# -split [environment]::NewLine
Running the new sample data through each option all produce this output
01234
11234
04321
14321
05678
15678

Regex replace using a substring of the regex result value

I've been reading a ton of material and thought I had found my solution but no luck. I need to find apostrophes contained in a name and then replace them with a double. I am loading a file to an array and then looping through that, looking for the apostrophes. The catch is that each row can have several apostrophes so that's why it's not a simple find and replace.
Here is a sample of the file:
create(xxxxxxx)using(xxxxxxx)name('O'Doe, John')
replace(xxxxxxx)instdata('ab 1234 ')
create(xxxxxxx)using(xxxxxxx)name('Doe, O'Jane')
replace(xxxxxxx)instdata('ab 5678 ')
There are other lines inbetween but they don't contain apostrophes.
Here is what I have so far:
$Pattern = "[A-Z]'[A-Z]"
$user = gc C:\Temp\mfnewuser.ins
for ($i = 0; $i -lt $user.count; $i++) {
if ($user[$i] -match $Pattern) {
$user[$i] = [regex]::replace($strText, $Pattern.substring(2,1), "''")
$user | out-file C:\Temp\mfnewuser.ins
}
}
I'm looking for a capital letter, followed by an apostrophe, followed by another capital. Because of the other commas, I can't just do a global replace. I know my pattern matching is working but I can't seem to manipulate it with the substring. The substring looks at $Pattern as a string instead of the result of a regex. If I can save the regex result to a variable, that would be great. I think then the replace would be easy.
Tried this as well but no luck either:
$Pattern = "[A-Z]'[A-Z]"
$NewPattern = "[A-Z]''[A-Z]"
$f = Get-Content C:\Temp\mfnewuser.ins
$f = $f -replace $Pattern, $NewPattern
$f | out-file C:\Temp\mfnewuser.ins
I may be approaching this all wrong and there is an easier way but I haven't seen anything yet.
EDIT:
Based on Bill_Stewarts example below, I've got this to work on the First Name but not yet the Last Name:
$Pattern = "[A-Z]'[A-Z]"
$user = gc C:\Temp\mfnewuser.ins
for ($i = 0; $i -lt $user.count; $i++) {
if ($user[$i] -match $Pattern) {
$user[$i] = $user[$i] -replace "(.*[A-Z])'([A-Z]+.*)", "`$1''`$2"
$user | out-file C:\Temp\mfnewuser.ins
}
}
Perhaps something like this?
get-content "test.txt" | foreach-object {
$_ -replace "([A-Z])'([A-Z])", "`$1''`$2"
}
Regular expressions can be grouped using ( ) and the -replace operator supports substring replacement ($1 and $2).
Replace your line, with the following.
$user[$i] = $user[$i] -replace "([A-Z])'([A-Z])", "`$1`''`$2"
Or try one of the following. This should suffice.
get-content "mfnewuser.ins" | foreach-object {
$_ -replace "([A-Z])'([A-Z])", "`$1`''`$2"
} | set-content "mfnewuser.ins"
...
get-content "mfnewuser.ins" | foreach-object {
$_ -replace "([a-zA-Z', ]+)'([a-zA-Z', ]+)", "`$1`''`$2"
} | set-content "mfnewuser.ins"

powershell regular expressions

quick one:
$logFile="D:\Code\functest\1725.log"
function getTime($pattern) {
Get-Content $logFile | %{ if ($_.Split('\t') -match $pattern) {$_} }
}
getTime("code")
gives me
simple 17-Feb-2011 10:45:27 Updating source code to revision: 49285
simple 17-Feb-2011 10:54:22 Updated source code to revision: 49285
but if I change the print value from
$_
to
$matches
I get nothing. I thought this array should have been created automatically? probably something silly, but this is my first day of using powershell :-)
EDIT: what I want to return is
Get-Date (column 2 of the matching line)
Your call to Split() is using C# conventions to escape the t to specify a tab character. In PowerShell, you use a single backtick e.g. $_.Split("`t"). Also -match is behaving a bit differently on an array like this so have it operate on each individual string like so:
Get-Content $logFile | Foreach {$_.Split("`t")} | Where { $_ -match $pattern }
There's also a sort of hidden trick here with Get-Content where you can get it to split for you:
Get-Content $logFile -del "`t" | Where { $_ -match $pattern }
Update: based on the updated question, try something like this:
gc $logFile | % {$cols = $_.Split("`t"); if ($cols[2] -match $pattern) {$cols[1]}}
Keeping in mind that arrays are 0-based in PowerShell. If the text is already in a DateTime format that PowerShell/.NET understand, you can just cast it to a DateTime like so [DateTime]$cols[1].
The $_.Split('\t') is breaking it.
First, it's breaking on every letter "t", not at tabs.
Second, it return a array that confounds -match.
With the following code:
Get-Content $logFile | %{ if ($_ -match $pattern) { $matches } }
getTime("code") would return:
Name Value
---- -----
0 code
0 code
This would allow to search with regular expressions, as in
$answerArray = getTime("(\t)(\d+)")
$digitsOfSecondResult = $answerArray[1][2]
Write-Output $digitsOfSecondResult
If you just want to print the lines that match the pattern, try:
Get-Content $logFile | %{ if ($_ -match $pattern) { $_} }
To get the date:
function getTime($pattern) {
Get-Content $logFile | %{ if ($_ -match $pattern) { Get-Date $matches[1] } }
}
getTime("`t(.+)`t.*code")
Or:
function getTime($pattern) {
Get-Content $logFile | %{ if ($_ -match "`t(.+)`t.*$pattern") { Get-Date $matches[1] } }
}
getTime("code")