I have some text content that I would like to split into a friendlier view and later export to CSV format. I want to replace the first few spaces on each line with tabs. I tried the regex pattern \s, but it split the whole text.
You may see sample data and my results
This should do the trick:
$sourceFilePath = 'c:\infile.txt'
$destFilePath = 'c:\outfile.txt'
$writeHandle = [System.IO.File]::OpenWrite( $destFilePath )
foreach($line in [System.IO.File]::ReadLines($sourceFilePath))
{
$outbuf = [byte[]][char[]](($line -replace '^(.*?) (.*?) (.*?) (.*?) (.*)$', '$1*$2*$3*$4*$5').Replace("*", "`t") + [environment]::NewLine)
[void]$writeHandle.Write( $outbuf, 0, $outbuf.Length )
}
[void]$writeHandle.Close()
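A shorter variation on the same idea, offered only as a sketch: the -split operator accepts a maximum number of substrings, so splitting a line into five parts and joining them with tabs converts just the first four spaces. The function name Convert-LeadingSpacesToTabs and the sample line are hypothetical, since the original sample data is not shown here.

```powershell
# Hypothetical helper: turns the first $Count spaces of a line into tabs.
function Convert-LeadingSpacesToTabs {
    param(
        [string]$Line,
        [int]$Count = 4
    )
    # A split limit of $Count + 1 leaves the remainder of the line untouched.
    ($Line -split ' ', ($Count + 1)) -join "`t"
}

# Assumed sample line, not taken from the original data.
Convert-LeadingSpacesToTabs -Line '2020-01-01 10:00:00 E100 OK some free text'
```

Any spaces after the fourth one stay as they are, which matches the behaviour of the regex in the answer above.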
You can use something like this
$inputtext = Get-Content 'EQ-Input.txt'
$outputobject = foreach ($Line in $inputtext) {
    $arr = $Line -split ' '
    [pscustomobject]@{
        Date = $arr[0]
        Time = $arr[1]
        Code = $arr[2]
        Result = $arr[3..($arr.Length-1)] -join ' '
    }
}
Then you can use $outputobject for further analysis or you can convert it or save it as CSV.
$outputobject | ConvertTo-Csv
$outputobject | Export-Csv -Path 'EQ-Output.csv'
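As a hedged variation, the same objects can be built with a split limit and multiple assignment, and -NoTypeInformation keeps Windows PowerShell 5.1 from writing a #TYPE header line into the CSV. The sample lines below are invented for illustration and are not the original data.

```powershell
# Invented sample lines standing in for the contents of EQ-Input.txt.
$inputtext = @(
    '2020-01-01 10:00:00 E100 Pump started normally'
    '2020-01-01 10:05:00 E200 Pressure low'
)

$outputobject = foreach ($line in $inputtext) {
    # Limit of 4: the fourth part keeps any remaining spaces.
    $date, $time, $code, $result = $line -split ' ', 4
    [pscustomobject]@{
        Date   = $date
        Time   = $time
        Code   = $code
        Result = $result
    }
}

# -NoTypeInformation suppresses the "#TYPE ..." line in Windows PowerShell.
$csv = $outputobject | ConvertTo-Csv -NoTypeInformation
$csv
```

The same switch works with Export-Csv when writing straight to a file.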
I have the following beginning of a PowerShell script, in which I would like to replace the values of variables for a different environment.
$SomeVar1 = "C:\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewritten
$SomeVar2 = "C:\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewritten
When I run the rewrite script, the result should look like this:
$SomeVar1 = "F:\different\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewritten
$SomeVar2 = "F:\different\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewritten
Current script that does(n't) rewrite:
$arr = @(
    [PSCustomObject]@{Regex = '$SomeVar1 = "'; Replace = '$SomeVar1 = "F:\different\path\to\file\a"'}
    [PSCustomObject]@{Regex = '$SomeVar2 = "'; Replace = '$SomeVar2 = "F:\different\path\to\file\b"'}
)
for ($i = 0; $i -lt $arr.Length; $i++) {
    $ArrRegex = [Regex]::Escape($arr[$i].Regex)
    $ArrReplace = $arr[$i].Replace
    # Get full line for replacement
    $Line = Get-Content $Workfile | Select-String $ArrRegex | Select-Object -First 1 -ExpandProperty Line
    # Rewrite part
    $Line = [Regex]::Escape($Line)
    $Content = Get-Content $Workfile
    $Content -replace "^$Line",$ArrReplace | Set-Content $Workfile
}
This replaces every occurrence at the start of a line in the file (I need only the first one) and does not replace the one in the note, which is okay.
Then I found "Powershell: Replace last occurence of a line in a file", which does the exact opposite of what I need: it rewrites only the last occurrence of the string, and it does so in the note as well. I would like to change it to do the opposite: replace the first occurrence at the beginning of a line (so it won't target the note).
The code in my case looks like this:
# Rewrite part
$Line = [Regex]::Escape($Line)
$Content = Get-Content $Workfile -Raw
$Line = "(?s)(.*)$Line"
$ArrReplace = "`$1$ArrReplace"
$Content -replace $Line,$ArrReplace | Set-Content $Workfile
Do you have any recommendations on how to achieve my goal, or is there a more sophisticated way to replace variables in PowerShell scripts like this?
Thanks in advance.
So I finally figured it out. I had to add Select-String "^$ArrRegex" when creating $Line to exclude any strings that were not at the beginning of a line, and then use this regex to do the job: ^(?s)(.*?\n)$Line
In my case it does the following: it selects only the first occurrence at the beginning of a line and replaces it. It ignores everything else and, when re-run, does not rewrite the others. The copies of the variables will not really exist in the final version; each will be set once, like $Var1 = "Value", and never changed during the script, but I wanted to be sure that I won't replace something by mistake.
The final replacement part looks like this:
for ($i = 0; $i -lt $arr.Length; $i++) {
    $ArrRegex = [Regex]::Escape($arr[$i].Regex)
    $ArrReplace = $arr[$i].Replace
    # Take only the first line that starts with the pattern
    $Line = Get-Content $Workfile | Select-String "^$ArrRegex" | Select-Object -First 1 -ExpandProperty Line
    $Line = [Regex]::Escape($Line)
    # Read the file as one string so the lazy prefix (.*?\n) can span lines
    $Content = Get-Content $Workfile -Raw
    $Line = "^(?s)(.*?\n)$Line"
    $ArrReplace = "`$1$ArrReplace"
    $Content -replace $Line, $ArrReplace | Set-Content $Workfile
}
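Another way to express "first occurrence at the start of a line", offered as a sketch rather than a drop-in replacement: the instance method Regex.Replace(input, replacement, count) stops after count matches, and the (?m) option makes ^ match at every line start. The function name Set-FirstAssignment and its parameter names are hypothetical.

```powershell
# Hypothetical helper: replaces only the first line that starts with $Prefix.
function Set-FirstAssignment {
    param(
        [string]$Content,      # whole script text as one string
        [string]$Prefix,       # literal line start, e.g. '$SomeVar1 = "'
        [string]$Replacement   # full replacement line
    )
    # (?m) lets ^ match at each line start; [^\r\n]* consumes the rest of that line only.
    [regex]$rx = '(?m)^' + [regex]::Escape($Prefix) + '[^\r\n]*'
    # '$$' yields a literal '$' in a .NET replacement string; the count of 1 stops after the first match.
    $rx.Replace($Content, $Replacement.Replace('$', '$$'), 1)
}

$text = @(
    '$SomeVar1 = "C:\path\to\file\a"'
    '$SomeVar1 = "C:\path\to\file\a" # Copy for test'
    '# Note $SomeVar1 = "C:\path\to\file\a"'
) -join "`n"

Set-FirstAssignment -Content $text -Prefix '$SomeVar1 = "' -Replacement '$SomeVar1 = "F:\different\path\to\file\a"'
```

Because the match is anchored to a line start, the occurrence inside the note is never a candidate, and the count of 1 leaves the second copy alone.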
You could possibly use flag variables like below to only do the first replacement for each of your regex patterns.
$Altered = Get-Content -Path $Workfile |
Foreach-Object {
if(-not $a) { #If replacement hasn't been done, replace
$_ = $_ -replace 'YOUR_REGEX1','YOUR_REPLACEMENT1'
if($_ -match 'YOUR_REPLACEMENT1') { $a = 'replacement done' } #Set Flag
}
if(-not $b) { #If replacement hasn't been done, replace
$_ = $_ -replace 'YOUR_REGEX2','YOUR_REPLACEMENT2'
if($_ -match 'YOUR_REPLACEMENT2') { $b = 'replacement done' } #Set Flag
}
$_ # Pipe back to $Altered
}
$Altered | Set-Content -Path $WorkFile
Just reverse the RegEx, if that is what you are after:
Clear-Host
@'
abc
abc
abc
'@ -replace '^(.*?)\babc\b', '$1HelloWorld'
# Results
<#
HelloWorld
abc
abc
#>
I have been struggling to successfully break apart contents of a text file and insert them into a .csv with the following rules:
The line containing '>' should be inserted into .csv column 1
The lines containing all caps should be inserted into .csv column 2 and each block of capital letters should be joined (have its `r or `n removed)
'>' and '*' should be removed where present
Separately, I can get column 1 to work fairly well using:
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
    if ($line -match '^>') {
        [pscustomobject]@{
            'Part1' = (Select-String '^>' -InputObject $line) -replace '>', ''
        }
    }
}
$data | Out-File 'newfile.csv'
and limited success using similar for column 2 (I can't seem to get -join to work with `r or `n):
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
    if ($line -match '^[A-Z].*') {
        [pscustomobject]@{
            # '*' is a regex quantifier, so it has to be escaped to match a literal asterisk
            'Part2' = (Select-String '^[A-Z].*' -InputObject $line) -replace '\*', ''
        }
    }
}
$data | Out-File 'newfile.csv'
But it escapes me how to get both to work in the same code block to iterate over each section delimited by '>' and/or '*'.
Below is a sample of the data for reference.
>9392290|2983921
FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIA
GDFOUYUIOAGHIHUAGSD
>lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1
FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAW
DFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNUPAPNUNPUFSAPNUSS
FSADUHHULGWAUNUNWEANNIOEAWNUNIIIINNBSDNJLKNJKLAERGJKLHHJLKGS
DFSAQSAHUSDFAHOUHGROUGRWE*
>jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31
ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPI
GWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIOAGPIOJSGNJHIOWEA
AUHNHIOEANPIASPNIOICBNIOASGIOEGWPIOWEPPPPSAJPOJKGPWEAIOJJPIO
FAWEIOPHGAHNIOPGWEOPPOEAWSPIOOPUIGSUIOGUIOPWAGIEOUIWEAOGUIOP
GEIOJHIOJPWEPJIOWGEIOPHGANIONIOGEWANIOEGWOPIHNNPIOEGWIJOWEAG
GEPUIEWUIOSZBHJENWNBENUEBMIPEWVMIEMUIAZWIPNBWEPEWIOJJKEAWPIA
GWEPHIOEWNPOEWANNNPIOGWREIJUOGUHIOSNJJJJJJJJKVMVIOIPEGIOEAUW
EGWIOJNENIOPIOWINPEAWNPOI*
I suggest using a -split operation:
(Get-Content -Raw samplefile.txt) -split '(?m)^>(.+)' -ne '' |
    ForEach-Object -Begin { $i = 0 } -Process {
        if (++$i % 2) { # 1st, 3rd, ... result, i.e. the ">"-prefixed lines
            $part1 = $_ # Save for later.
        } else { # 2nd, 4th, ... result, i.e. the all-uppercase lines
            [pscustomobject] @{ # Construct and output a custom object.
                Part1 = $part1
                Part2 = $_ -replace '\r?\n|\*$' # Remove newlines and trailing "*"
            }
        }
    } # pipe to Export-Csv as needed.
To-display output:
Part1 Part2
----- -----
9392290|2983921 FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIAGDFOUYUIOAGHIHUAGSD
lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1 FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAWDFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNU…
jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31 ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPIGWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIO…
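For comparison, here is a line-by-line sketch of the same grouping that accumulates the uppercase lines until the next ">" header appears. It is an alternative to the -split approach, not the method from the answer above; the function name ConvertTo-RecordObject and the short sample lines are hypothetical.

```powershell
# Hypothetical helper: groups lines into objects, one per ">" header.
function ConvertTo-RecordObject {
    param([string[]]$Lines)
    $header = $null
    $body = ''
    foreach ($line in $Lines) {
        if ($line -match '^>(.*)') {
            # A new header starts: emit the record collected so far, if any.
            if ($null -ne $header) { [pscustomobject]@{ Part1 = $header; Part2 = $body } }
            $header = $Matches[1]
            $body = ''
        } elseif ($line.Trim()) {
            # Append the sequence line without surrounding whitespace or a trailing "*".
            $body += $line.Trim().TrimEnd('*')
        }
    }
    # Emit the last record.
    if ($null -ne $header) { [pscustomobject]@{ Part1 = $header; Part2 = $body } }
}

# Shortened, invented sample in the same shape as the data above.
$records = ConvertTo-RecordObject -Lines @('>id1|x', 'ABC', 'DEF*', '>id2', 'GHI')
$records
```

With real data, the input could come from Get-Content 'samplefile.txt' and the result could be piped to Export-Csv -NoTypeInformation.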
I'm currently trying to match a pattern in IDs and replace each ID with a shortened one that starts with 0 or 1.
For example, pc0045601234 should become 01234: the last four digits (1234) with the third character (0) added in front.
I tried the code below, but the output only filled the userid column with "No Matching Employee".
$reportPath = '.\report.csv'
$csvPath = '.\output.csv'
$data = Import-Csv -Path $reportPath
$output = @()
foreach ($row in $data) {
    $table = "" | Select ID,FirstName,LastName,userid
    $table.ID = $row.ID
    $table.FirstName = $row.FirstName
    $table.LastName = $row.LastName
    switch -Wildcard ($row.ID)
    {
        {$row.ID -match 'P\d\d\d\d\d\D\D\D'} {$table.userid = "Contractor"; continue}
        {$row.ID -match 'SEC\d\d\d\D\D\D\D'} {$table.userid = "Contractor"; continue}
        {$row.ID.StartsWith("P005700477")} {$table.userid = $row.ID -replace "P005700477","0477"; continue}
        {$row.ID.StartsWith("P00570")} {$table.userid = $row.ID -replace "P00570","0"; continue}
        default {$table.userid = "No Matching Employee"}
    }
    $output += $table
}
$output | Export-csv -NoTypeInformation -Path $csvPath
Here are three different ways to achieve the desired result. The first two use the same technique, just written in a different way.
First we put the sample data in a variable as an array of lines, split from a multiline string. This is equivalent to $text = Get-Content $somefile
$text = @'
PC05601234
PC15601234
'@ -split [environment]::NewLine
Option 1 # convert to character array, select the 3rd and last 4 digits.
$text | foreach {-join ($_.ToChararray()| select -Skip 2 -First 1 -Last 4)}
Option 2 # same as above, requiring an extra -join to avoid spaces.
$text | foreach {(-join $_.ToChararray()| foreach{$_[2]+(-join $_[-4..-1])})}
Option 3 # my preference, regex. Capture the desired digits and replace the entire string with those two captured values.
$text -replace '^\D+(\d)\w+(\d{4})$','$1$2'
All of these output
01234
11234
Further testing with different char/digit combinations and lengths.
$text = @'
PC05601234
PC15601234
PC0ABC124321
PC1DE4321
PC0A5678
PC1ABCD215678
'@ -split [environment]::NewLine
Running the new sample data through each option all produce this output
01234
11234
04321
14321
05678
15678
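To connect the regex option back to the CSV script in the question, here is a minimal sketch that returns the short ID when the pattern matches and a fallback text otherwise. The -replace operator returns the input unchanged when nothing matches, so the sketch uses -match and $Matches instead. The function name Get-ShortUserId is hypothetical, and the fallback text is borrowed from the question.

```powershell
# Hypothetical helper: first digit after the letter prefix plus the last four digits.
function Get-ShortUserId {
    param([string]$Id)
    if ($Id -match '^\D+(\d)\w+(\d{4})$') {
        # Group 1 is the leading digit, group 2 the last four digits.
        $Matches[1] + $Matches[2]
    } else {
        'No Matching Employee'
    }
}

'PC05601234', 'PC1DE4321', '12345' | ForEach-Object { Get-ShortUserId -Id $_ }
```

In the original loop, the result could be assigned to $table.userid in place of the switch statement, assuming all employee IDs follow this shape.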
I want to add " after the third comma and " before the fifth comma. How can this be done in PowerShell?
My idea is to use a regex to find the positions of the third and fifth commas and then insert " at those positions with something like
$s.Insert(4,'-') # in case the regex returns position 4
Example data:
04642583,3,HC Mobile,O213,Inc,SIS Services,KR,Non Payroll Relevant,KR50
Output
04642583,3,HC Mobile,"O213,Inc",SIS Services,KR,Non Payroll Relevant,KR50
This is the code I tried, but it failed with 'An empty pipe element is not allowed'. How can I fix it?
$source = "D:\Output\MoreComma.csv"
$FinalFile = "D:\Output\MoreComma_Corrected.csv"
$content = Get-Content $source
foreach ($line in $content)
{
$items = $line.split(',');
$items[3] = '"'+$items[3]
$items[4] = $items[4]+'"';
$items -join ','
} | Set-Content $FinalFile
If you know the format (e.g. you know the data is always comma-separated in this way) and this is all you're trying to achieve, you can simply split the line, add the quotes, and join the line again.
Example:
$data = "04642583,3,HC Mobile,O213,Inc,SIS Services,KR,Non Payroll Relevant,KR50";
$items = $data.split(',');
$items[3] = '"'+$items[3]
$items[4] = $items[4]+'"';
$items -join ','
This will produce the line:
04642583,3,HC Mobile,"O213,Inc",SIS Services,KR,Non Payroll Relevant,KR50
Given you've stored this in a CSV file:
$file = "C:\tmp\test.csv";
$lines = (get-content $file);
$newLines=($lines|foreach-object {
$items = $_.split(',');
$items[3] = '"'+$items[3]
$items[4] = $items[4]+'"';
$items -join ','
})
You can then output the result in a new file if you want
$newLines|Set-content C:\tmp\test2.csv
This will "mess up" the format of your CSV file, though (the two columns will be treated as merged into one), but I'm guessing this is what you're trying to achieve?
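The error in the question comes from piping the output of a foreach statement: unlike the ForEach-Object cmdlet, the foreach statement cannot be followed directly by a pipe. The sketch below uses ForEach-Object together with a single regex that quotes the fourth and fifth fields in one step. The function name Add-FieldQuotes is hypothetical, and the pattern assumes the first five fields contain no embedded commas or quotes of their own.

```powershell
# Hypothetical helper: wraps fields 4 and 5 (1-based) of a line in one pair of quotes.
function Add-FieldQuotes {
    param([string]$Line)
    # Group 1: the first three fields with their commas; group 2: fields 4 and 5.
    $Line -replace '^((?:[^,]*,){3})([^,]*,[^,]*)', '$1"$2"'
}

$lines = @('04642583,3,HC Mobile,O213,Inc,SIS Services,KR,Non Payroll Relevant,KR50')

# ForEach-Object is a command, so its output can be piped onward, e.g. to Set-Content.
$fixed = $lines | ForEach-Object { Add-FieldQuotes -Line $_ }
$fixed
```

Lines with fewer than five fields do not match the pattern and pass through unchanged.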
I've followed the excellent solution in this article:
PowerShell multiple string replacement efficiency
to try and normalize telephone numbers imported from Active Directory. Here is an example:
$telephoneNumbers = @(
    '+61 2 90237534',
    '04 2356 3713'
    '(02) 4275 7954'
    '61 (0) 3 9635 7899'
    '+65 6535 1943'
)
# Build hashtable of search and replace values.
$replacements = @{
    ' ' = ''
    '(0)' = ''
    '+61' = '0'
    '(02)' = '02'
    '+65' = '001165'
    '61 (0)' = '0'
}
# Join all (escaped) keys from the hashtable into one regular expression.
[regex]$r = @($replacements.Keys | foreach { [regex]::Escape( $_ ) }) -join '|'
[scriptblock]$matchEval = { param( [Text.RegularExpressions.Match]$matchInfo )
    # Return replacement value for each matched value.
    $matchedValue = $matchInfo.Groups[0].Value
    $replacements[$matchedValue]
}
# Perform replace over every line in the file and append to log.
$telephoneNumbers |
    foreach {$r.Replace($_,$matchEval)}
I'm having problems with the formatting of the match expressions in the $replacements hashtable. For example, I would like to match all +61 numbers and replace with 0, and match all other + numbers and replace with 0011.
I've tried the following regular expressions but they don't seem to match:
'^+61'
'^+[^61]'
What am I doing wrong? I've tried using \ as an escape character.
I've rearranged this a bit. I'm not sure whether it works for your whole situation, but it gives the right results for the example.
I think the key is not to try and create one big regex from the hashtable, but rather to loop over it and check the values in it against the telephone numbers.
The only other change I made was moving the ' ','' replacement from the hash into the code that prints the replacement phone number, as you want this to run in every scenario.
Code is below:
$telephoneNumbers = @(
    '+61 2 90237534',
    '04 2356 3713'
    '(02) 4275 7954'
    '61 (0) 3 9635 7899'
    '+65 6535 1943'
)
$replacements = @{
    '(0)' = ''
    '+61' = '0'
    '(02)' = '02'
    '+65' = '001165'
}
foreach ($t in $telephoneNumbers) {
    $m = $false
    foreach($r in $replacements.getEnumerator()) {
        if ( $t -match [regex]::Escape($r.key) ) {
            $m = $true
            $t -replace [regex]::Escape($r.key), $r.value -replace ' ', '' | write-output
        }
    }
    if (!$m) { $t -replace ' ', '' | write-output }
}
Gives:
0290237534
0423563713
0242757954
61396357899
00116565351943
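On the regex part of the question: + is a quantifier in a regular expression, so a literal plus sign has to be written as \+, and [^61] is a character class that excludes the single characters 6 and 1 rather than the sequence 61. A negative lookahead, (?!61), expresses "not followed by 61". The sketch below applies these ideas as a chain of anchored -replace steps. The function name Convert-PhoneNumber and the choice of rules are assumptions based on the entries in the question's hashtable, not a complete normalizer.

```powershell
# Hypothetical helper: normalizes a phone number with a few anchored regex rules.
function Convert-PhoneNumber {
    param([string]$Number)
    $n = $Number -replace '\s', ''          # remove all whitespace first
    $n = $n -replace '^\+?61\(0\)', '0'     # "61 (0)" prefix, with or without "+"
    $n = $n -replace '^\+61', '0'           # Australian country code
    $n = $n -replace '^\+(?!61)', '0011'    # any other "+" prefix
    $n -replace '[()]', ''                  # drop remaining parentheses
}

$telephoneNumbers = @(
    '+61 2 90237534'
    '04 2356 3713'
    '(02) 4275 7954'
    '61 (0) 3 9635 7899'
    '+65 6535 1943'
)
$telephoneNumbers | ForEach-Object { Convert-PhoneNumber -Number $_ }
```

Note that this sketch turns '61 (0) 3 9635 7899' into 0396357899, following the '61 (0)' = '0' entry in the question, which differs from the 61396357899 produced by the loop above.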