How to use a conditional statement with regex in PowerShell? - regex

There are about ten lines of data. For each line of data I want to indicate whether that line contains numerals.
How can I print out "yes, this line has numerals" or "no, this line has no numerals" for each and every line, exactly once?
output:
thufir@dur:~/flwor/csv$
thufir@dur:~/flwor/csv$ pwsh import.ps1
no digits
Name
----
people…
thufir@dur:~/flwor/csv$
code:
$text = Get-Content -Raw ./people.csv
[array]::Reverse($text)
$tempAttributes = @()
$collectionOfPeople = @()
ForEach ($line in $text) {
if($line -notmatch '.*?[0-9].*?') {
$tempAttributes += $line
Write-Host "matches digits"
}
else {
Write-Host "no digits"
$newPerson = [PSCustomObject]@{
Name = $line
Attributes = $tempAttributes
}
$tempAttributes = @()
$collectionOfPeople += $newPerson
}
}
$collectionOfPeople
data:
people
joe
phone1
phone2
phone3
sue
cell4
home5
alice
atrib6
x7
y9
z10
The only reason I'm printing "digits" or "no digits" is as a marker to aid in building the object.

You can use the following:
switch -regex -file people.csv {
'\d' { "yes" ; $_ }
default { "no"; $_ }
}
\d is a regex character class that matches a single digit. A switch statement with -regex treats each condition as a regular expression to match against the input. The default condition is picked when no other condition is met. $_ is the current line being processed.
switch is generally faster than Get-Content for line by line processing. Since you do want to perform certain actions per line, you likely don’t want to use the -Raw parameter because that will read in all file contents as one single string.
# For Reverse Output
$output = switch -regex -file people.csv {
'\d' { "yes" ; $_ }
default { "no"; $_ }
}
$output[($output.GetUpperBound(0))..0]
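Since the digit/no-digit marker is only a means to build the objects, the same switch can collect them directly. A rough sketch, assuming every digit-free line is a name and every following line that contains a digit is one of its attributes. The function name ConvertTo-PersonList is made up for illustration, and the `::new()` syntax assumes PowerShell 5 or later:

```powershell
# Hypothetical helper (not from the original post): groups each
# digit-containing line under the most recent digit-free line.
function ConvertTo-PersonList {
    param([string[]]$Lines)

    $people  = [System.Collections.Generic.List[object]]::new()
    $current = $null

    switch -Regex ($Lines) {
        '^\s*$' { continue }              # skip blank lines
        '\d' {
            # Line has a digit: treat it as an attribute of the current person.
            if ($current) { $current.Attributes.Add($_) }
        }
        default {
            # No digit: start a new person.
            $current = [pscustomobject]@{
                Name       = $_
                Attributes = [System.Collections.Generic.List[string]]::new()
            }
            $people.Add($current)
        }
    }
    $people
}

$demo = ConvertTo-PersonList -Lines 'joe', 'phone1', 'phone2', 'sue', 'cell4'
$demo | ForEach-Object { '{0}: {1}' -f $_.Name, ($_.Attributes -join ', ') }
```

With the sample file, something like `ConvertTo-PersonList -Lines (Get-Content ./people.csv | Select-Object -Skip 1)` would skip the "people" header line, which otherwise becomes a name with no attributes.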

Related

PowerShell: Pattern Matching Text File Contents to Insert into .CSV

I have been struggling to successfully break apart contents of a text file and insert them into a .csv with the following rules:
The line containing '>' should be inserted into .csv column 1
The lines containing all caps should be inserted into .csv column 2 and each block of capital letters should be joined (have its `r or `n removed)
'>' and '*' should be removed where present
Separately, I can get column 1 to work fairly well using:
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
if ($line -match '^>') {
[pscustomobject]@{
'Part1' = (Select-String '^>' -InputObject $line) -replace '>', ''
}
}
}
$data | Out-File 'newfile.csv'
and limited success using similar for column 2 (I can't seem to get -join to work with `r or `n):
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
if ($line -match '^[A-Z].*') {
[pscustomobject]@{
'Part2' = (Select-String '^[A-Z].*' -InputObject $line) -replace '*', ''
}
}
}
$data | Out-File 'newfile.csv'
But it escapes me how to get both to work in the same code block to iterate over each section delimited by '>' and/or '*'.
Below is a sample of the data for reference.
>9392290|2983921
FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIA
GDFOUYUIOAGHIHUAGSD
>lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1
FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAW
DFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNUPAPNUNPUFSAPNUSS
FSADUHHULGWAUNUNWEANNIOEAWNUNIIIINNBSDNJLKNJKLAERGJKLHHJLKGS
DFSAQSAHUSDFAHOUHGROUGRWE*
>jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31
ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPI
GWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIOAGPIOJSGNJHIOWEA
AUHNHIOEANPIASPNIOICBNIOASGIOEGWPIOWEPPPPSAJPOJKGPWEAIOJJPIO
FAWEIOPHGAHNIOPGWEOPPOEAWSPIOOPUIGSUIOGUIOPWAGIEOUIWEAOGUIOP
GEIOJHIOJPWEPJIOWGEIOPHGANIONIOGEWANIOEGWOPIHNNPIOEGWIJOWEAG
GEPUIEWUIOSZBHJENWNBENUEBMIPEWVMIEMUIAZWIPNBWEPEWIOJJKEAWPIA
GWEPHIOEWNPOEWANNNPIOGWREIJUOGUHIOSNJJJJJJJJKVMVIOIPEGIOEAUW
EGWIOJNENIOPIOWINPEAWNPOI*
I suggest using a -split operation:
(Get-Content -Raw samplefile.txt) -split '(?m)^>(.+)' -ne '' |
ForEach-Object -Begin { $i = 0 } -Process {
if (++$i % 2) { # 1st, 3rd, ... result, i.e. the ">"-prefixed lines
$part1 = $_ # Save for later.
} else { # 2nd, 4th, ... result, i.e. the all-uppercase lines
[pscustomobject] @{ # Construct and output a custom object.
Part1 = $part1
Part2 = $_ -replace '\r?\n|\*$' # Remove newlines and trailing "*"
}
}
} # pipe to Export-Csv as needed.
To-display output:
Part1 Part2
----- -----
9392290|2983921 FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIAGDFOUYUIOAGHIHUAGSD
lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1 FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAWDFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNU…
jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31 ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPIGWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIO…
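The same -split idea can be wrapped in a small function, which makes it easier to test on an inline string before pointing it at a file. This is only a sketch: the name ConvertFrom-HeaderedBlocks is invented here, it assumes the text starts with a ">" line, and it deviates from the answer above by trimming the header and stripping every CR, LF and "*" character, so that files with CRLF line endings behave the same as LF-only files:

```powershell
# Hypothetical wrapper around the -split approach (name is illustrative).
function ConvertFrom-HeaderedBlocks {
    param([string]$Text)

    $i = 0
    $part1 = $null
    # The capture group makes -split return the header text between the chunks.
    foreach ($chunk in ($Text -split '(?m)^>(.+)' -ne '')) {
        if (++$i % 2) {
            # Odd results: header line; Trim() drops a trailing CR if present.
            $part1 = $chunk.Trim()
        } else {
            # Even results: the sequence lines; join them and drop "*".
            [pscustomobject]@{
                Part1 = $part1
                Part2 = $chunk -replace '[\r\n*]'
            }
        }
    }
}

$sample = ">id1`r`nAAA`r`nBBB*`r`n>id2`r`nCCC*"
ConvertFrom-HeaderedBlocks -Text $sample
```

For the real file, `ConvertFrom-HeaderedBlocks -Text (Get-Content -Raw samplefile.txt) | Export-Csv newfile.csv -NoTypeInformation` should produce the two-column CSV.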

Powershell script using RegEx to look for a pattern in one .txt and find line in a second .txt

I have a real "headsmasher" on my plate.
I have this piece of script:
$lines = Select-String -List -Path $sourceFile -Pattern $pattern -Context 20
foreach ($id in $lines) {
if (Select-String -Quiet -LiteralPath export.txt -Pattern "$($Matches[1]).+$($id.Pattern)") {
}
else {
Select-String -Path $sourceFile -Pattern $pattern -Context 20 >> $duplicateTransactionsFile
}
}
but it is not working for me as I wanted it to.
I have two .txt files: "$sourcefile = source.txt" and "export.txt"
The source.txt looks like something like this:
Some text here ***********
------------------------------------------------
F I N A L C O U N T 1 9 , 9 9
**************
** [0000123456]
ID Number:0000123456
Complete!
****************!
***********
Some other text here*******
------------------------------------------------
F I N A L C O U N T 9 , 9 9
**********
** [0000789000]
ID Number:0000789000
Complete!
******************!
************
The export.txt is like this:
0000123456 19,99
0000555555 ,89
0000666666 3,05
0000777777 31,19
0000789000 9,99
What I am trying to do is look into source.txt and search for the number that I enter (spaced out in my case)
*e.g: "9,99" but only that. As you can see, the next number in the source.txt is "19,99" and it also contains "9,99" but I do not want it to be matched.
and once I find the number, look for the next line in the source.txt that contains the text "ID Number:" then get the numbers right after the ":" Once I get those numbers after the ":", I want to now look into the export.txt and see if the numbers after the ":" are there and whether it has the "9,99" on the same line next to it but exactly "9,99" and nothing else like "19,99", "29,99", and so on.
Then the rest is easy:
if (*true*) {
do this
}
else {
do that
}
Could you guys give me some love here and help a brother out?
I very much appreciate any help or hint you could share.
Best of wishes!
You could approach this like below:
# read the export.txt file and convert to a Hashtable for fast lookup
$export = ((Get-Content -Path 'D:\Test\export.txt').Trim() -replace '\s+', '=') -join "`r`n" | ConvertFrom-StringData
# read the source file and split into multiline data blocks
$source = ((Get-Content -Path 'D:\Test\source.txt' -Raw) -split '-{2,}').Trim() | Where-Object { $_ -match '(?sm)^\s?F I N A L C O U N T' }
# make sure the number given is spaced-out
$search = (((Read-Host "Search for Final Count number") -replace '\s' -split '') -join ' ').Trim()
Write-Host "Looking for a matching item using Final Count '$search'"
# see if we can find a data block that matches the $search
$blocks = $source | Where-Object { $_ -match "(?sm)^F I N A L C O U N T\s+$search\s?$" }
if (!$blocks) {
Write-Host "No item in source.txt could be found with Final Count '$search'" -ForegroundColor Red
}
else {
# loop over the data block(s) and pick the one that matches the search count
$blocks | ForEach-Object {
# parse out the ID
$id = $_ -replace '(?sm).*ID Number:(\d+).*', '$1'
# check if the $export Hashtable contains a key with that ID number
if ($export.Contains($id)) {
# check if that item has a value of $search without the spaces
if ($export[$id] -eq ($search -replace '\s')) {
# found it; do something
Write-Host "Found a match in the export.txt" -ForegroundColor Green
}
else {
# found ID with different FinalCount
Write-Host "An item with ID '$id' was found, but with different Final Count ($($export[$id]))" -ForegroundColor Red
}
}
else {
# ID not found
Write-Host "No item with ID '$id' could be found in the export.txt" -ForegroundColor Red
}
}
}
If as per your comment, you would like the code to loop over the Final Count numbers found in the source.txt file instead of a user typing in a number to search for, you can shorten the above code to:
# read the export.txt file and convert to a Hashtable for fast lookup
$export = ((Get-Content -Path 'D:\Test\export.txt').Trim() -replace '\s+', '=') -join "`r`n" | ConvertFrom-StringData
# read the source file and split into multiline data blocks
$blocks = ((Get-Content -Path 'D:\Test\source.txt' -Raw) -split '-{2,}').Trim() |
Where-Object { $_ -match '(?sm)^\s?F I N A L C O U N T' }
if (!$blocks) {
Write-Host "No Final Count blocks could be found in source.txt" -ForegroundColor Red
}
else {
# loop over the data block(s)
$blocks | ForEach-Object {
# parse out the FINAL COUNT number to look for in the export.txt
$search = ([regex]'(?sm)^F I N A L C O U N T\s+([\d,\s]+)$').Match($_).Groups[1].Value
# remove the spaces, surrounding '0' and trailing comma (if any)
$search = ($search -replace '\s').Trim('0').TrimEnd(',')
Write-Host "Looking for a matching item using Final Count '$search'"
# parse out the ID
$id = $_ -replace '(?sm).*ID Number:(\d+).*', '$1'
# check if the $export Hashtable contains a key with that ID number
if ($export.Contains($id)) {
# check if that item has a value of $search without the spaces
if ($export[$id] -eq $search) {
# found it; do something
Write-Host "Found a match in the export.txt with ID: $id" -ForegroundColor Green
}
else {
# found ID with different FinalCount
Write-Host "An item with ID '$id' was found, but with different Final Count ($($export[$id]))" -ForegroundColor Red
}
}
else {
# ID not found
Write-Host "No item with ID '$id' could be found in the export.txt" -ForegroundColor Red
}
}
}
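Both versions above rely on a Hashtable keyed by ID, so that checking an ID and its amount is a single lookup. As an alternative sketch to the ConvertFrom-StringData pipeline, the same table can be built with -match. The function name ConvertTo-ExportTable is hypothetical, and the pattern assumes each line holds a numeric ID, whitespace, then an amount without spaces:

```powershell
# Hypothetical helper: turns lines like "0000789000    9,99" into a
# Hashtable mapping ID -> amount text.
function ConvertTo-ExportTable {
    param([string[]]$Lines)

    $table = @{}
    foreach ($line in $Lines) {
        # Group 1 = ID digits, group 2 = amount (any run of non-space characters).
        if ($line -match '^\s*(\d+)\s+(\S+)\s*$') {
            $table[$Matches[1]] = $Matches[2]
        }
    }
    $table
}

$export = ConvertTo-ExportTable -Lines '0000123456    19,99', '0000555555    ,89', '0000789000    9,99'

# Exact comparison: "19,99" does not match a search for "9,99".
$export['0000789000'] -eq '9,99'
$export['0000123456'] -eq '9,99'
```

Lines that do not fit the pattern are silently skipped in this sketch; whether that is acceptable depends on the real export file.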
There are surely multiple valid ways to accomplish this. Here is my approach:
(See comments for explanations. Let me know if you have any questions)
param (
# You can provide this when calling the script using "-Search 9,99"
# If not provided, PowerShell will prompt you to enter the value
[Parameter(Mandatory)]
$Search,
$Source = "source.txt",
$Export = "export.txt"
)
# insert spaces
$pattern = $Search.ToCharArray() -join " "
# Search for the value in the source file.
$found = $false
switch -Regex -File $Source {
# This regex looks for something that is not a number,
# followed by only whitespace, and then your (spaced) search value.
# This makes sure "19,99" is not matched with "9,99".
# You could use a more elaborate regex here, but for your example,
# this one should work fine.
"\D\s+$pattern" {
$found = $true
}
"ID Number:(\d+)" {
# Get the ID number from the match.
$id = $Matches[1]
# If the search value was found earlier
# (that means, this is the first ID number after the search value)
# we can stop looking.
if ($found) {
break
}
}
}
# quick check if the value was actually found
if (-not $found) {
throw "Value $Search not found in $Source."
}
# Search for the id in the export file.
switch -Regex -File $Export {
"$id\s+(\S+)" {
# Get the amount value from the match
$value = $Matches[1]
# If the value matches your search...
if ($value -eq $search) {
# do this
}
else {
# otherwise do that
}
break
}
}
Note: You could additionally convert the values to decimal to account for different text representations when searching and comparing.
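One way the decimal conversion might look is sketched below. It assumes the amounts always use a comma as the decimal separator (as in the sample data) and builds its own NumberFormatInfo rather than relying on an installed culture. ConvertTo-Amount is a made-up name:

```powershell
# Hypothetical helper: parses "9,99", " ,89" or "9 , 9 9" into a [decimal].
function ConvertTo-Amount {
    param([string]$Text)

    # Custom number format: "," is the decimal separator, "." the group separator.
    $nfi = [System.Globalization.NumberFormatInfo]::new()
    $nfi.NumberGroupSeparator   = '.'
    $nfi.NumberDecimalSeparator = ','

    # Remove all whitespace first so spaced-out values parse as well.
    $compact = $Text -replace '\s'
    [decimal]::Parse($compact, [System.Globalization.NumberStyles]::Number, $nfi)
}

# Spaced and unspaced forms compare equal once converted.
(ConvertTo-Amount '9 , 9 9') -eq (ConvertTo-Amount '9,99')
# Different amounts stay different.
(ConvertTo-Amount '19,99') -eq (ConvertTo-Amount '9,99')
```

Comparing the converted values instead of the raw text avoids having to normalize spaces and leading zeros by hand.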

Powershell Regex Function Deleting CRLFs

I am using the below function to create a JSON file from a SQL file. Unfortunately it is deleting the CRLF at the end of each line of the SQL file. I want it to keep them instead.
function GetStringBetweenTwoStrings($firstString, $secondString, $importPath){
#Get content from file
$file = Get-Content $importPath
#Regex pattern to compare two strings
$pattern = "$firstString(.*?)$secondString"
#Perform the operation
$result = [regex]::Match($file,$pattern).Groups[1].Value
#Return result
return "{""sql"":"""+$result+"""}"
}
I have tried using -raw but it does not seem to work
Thanks,
John
Interesting question
Unfortunately, I couldn't figure out a way to keep CRLF characters from `[regex]::Match` command.
It captures them fine but seems to return them as a single string by default.
If someone can figure that out, I'd be glad to see it.
Thanks to people much smarter than me, the following way with [regex]::match seems to work
function Get-StringBetweenTwoStrings {
[cmdletBinding()]
param (
$firstString,
$secondString,
$fullString
)
# Get content from file WITH -RAW
$file = Get-Content -Path $fullString -Raw
Write-Verbose $file -Verbose
# Regex pattern to compare two strings
$pattern = '{0}(.*?){1}' -f $firstString, $secondString
Write-Verbose $pattern -Verbose
# Perform the operation
$result = [regex]::Match($file, $pattern, 'SingleLine, MultiLine, IgnoreCase').Value
# Result
"{""sql"":""$result""}"
}
Test the code
Get-StringBetweenTwoStrings -firstString '(?<=GO)' -secondString '(?=GO)' -fullString .\Downloads\test.txt
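A likely explanation for the lost line breaks in the original function: Get-Content without -Raw returns an array of lines, and passing that array to a string parameter joins the lines with spaces. Reading with -Raw and letting "." match newlines keeps them. The sketch below separates the matching from the file reading so it can be tried on an inline string. Get-TextBetween is a hypothetical name, the delimiters are assumed to be literal text (hence [regex]::Escape), and ConvertTo-Json is used so the line breaks are escaped into valid JSON:

```powershell
# Hypothetical helper: returns the text between two literal delimiters,
# including any CR/LF characters in between.
function Get-TextBetween {
    param([string]$Text, [string]$First, [string]$Second)

    # (?s) = Singleline mode, so "." also matches newline characters.
    $pattern = '(?s)' + [regex]::Escape($First) + '(.*?)' + [regex]::Escape($Second)
    $m = [regex]::Match($Text, $pattern)
    if ($m.Success) { $m.Groups[1].Value }
}

$text = "GO`r`nSELECT 1;`r`nGO"
$sql  = Get-TextBetween -Text $text -First 'GO' -Second 'GO'

# ConvertTo-Json escapes the CR/LF pairs instead of dropping them.
$json = [pscustomobject]@{ sql = $sql } | ConvertTo-Json -Compress
$json
```

For a file, the text would come from `Get-Content -Raw $importPath`.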
Workaround
When all else fails, I go back to brute force.
Start capturing when we see our $firstString, and keep capturing until we find our $secondString or reach the end.
Sample Data
$s = @'
# This is a random comment
GOSELECT TOP (1)
*
FROM dbo.Users
WHERE CaffeineLevel = 'Low';
# Can we get a cafGOfeine drip?
GO
# Why isn't this easier
'@ -split '\r?\n'
Code
$capture = [Text.StringBuilder]::new()
$capturing = $false
$firstString = 'GO'
$secondString = 'GO'
foreach ($line in $s) {
if ($line -match $secondString -and $capturing) {
Write-Verbose "Stopping...$line" -Verbose
<#
In case we want to capture a partial line
look for everything UNTIL our second string
#>
$splitLine = ($line | Select-String -Pattern ".*(?=$secondString)").Matches.Value
Write-Verbose "Capturing: [$splitLine]" -Verbose
$null = $capture.AppendLine($splitLine)
$capturing = $false
<# second string found, stop altogether #>
break
}
if ($capturing) {
Write-Verbose "Capturing: [$line]" -Verbose
$null = $capture.AppendLine($line)
}
if ($line -match $firstString) {
Write-Verbose "Starting...$line" -Verbose
<#
In case we want to capture a partial line,
look for everything AFTER our first string
#>
$splitLine = ($line | Select-String -Pattern "(?<=$firstString).*").Matches.Value
Write-Verbose "Capturing: [$splitLine]" -Verbose
$null = $capture.AppendLine($splitLine)
$capturing = $true
}
}
$capture.ToString()

Regular expression in power shell to convert character between double quotes to upper case

I want to write a powershell script which will convert a string which is present between double quotes in a file, and convert it into upper case.
The files are placed in different folders.
I am able to extract the string between the double quotes and convert it to upper case, but not able to replace it in the correct position.
Ex : This is the input string.
"e" //&&'i&&
The output should be
"E" //&&'i&&
This is what I have tried. Also, it is not even replacing the content of the file.
$items = Get-ChildItem * -recurse
# enumerate the items array
foreach ($item in $items)
{
# if the item is a directory, then process it.
if ($item.Attributes -ne "Directory")
{
(Get-Content $item.FullName ) |
Foreach-Object {
if (($_ -match '\"'))
{
$str = $_
$ext = [regex]::Matches($str, '".*?"').Value -replace '"'
$ext = $ext.ToUpper()
Write-Host $ext
$_ = $ext
}
else { }
} |
Set-Content $item.FullName
}
}
This can do it. Really I wasn't following your code so I stripped it and modified the regex.
$items = Get-ChildItem "C:\Users\UsernameHere\Desktop\Folder123\*.txt"
# enumerate the items array
foreach ($item in $items){
# skip directories; only files are processed
if ($item.Attributes -ne "Directory"){
$content = (gc $item.FullName )
# String.Replace() is literal, so use [regex]::Replace with a script block that upper-cases each match
$content = $content | ForEach-Object { [regex]::Replace($_, '"\w.*"', { param($m) $m.Value.ToUpper() }) }
$content | sc $item.FullName
}
}
If you had powershell 6 or 7:
'"hi"' -replace '".*"', { $_.value.toupper() }
"HI"
'"e" //&&''i&&' -replace '".*"', { $_.value.toupper() }
"E" //&&'i&&
I am able to print the upper-case characters with the code below, but the file is not getting updated; it still has the old characters. How can I update the file with the new contents?
$items = Get-ChildItem *.txt -recurse
# enumerate the items array
foreach ($item in $items)
{
# if the item is a directory, then process it.
if ($item.Attributes -ne "Directory")
{
(Get-Content $item.FullName ) |
Foreach-Object {
$str = $_
$_ = [regex]::Replace($_, '"[^"]*"', { param($m) $m.Value.ToUpper() })
Write-Host $_
} |
Set-Content $item.FullName
}
}
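A probable cause for the unchanged file: inside ForEach-Object the new text is assigned to $_ and written to the host, but nothing is emitted to the pipeline, so Set-Content receives no input. A minimal sketch that emits the converted line instead is shown below. Both function names are invented for illustration, and -File on Get-ChildItem assumes PowerShell 3 or later:

```powershell
# Hypothetical helper: upper-cases every double-quoted substring of a line
# and returns the resulting line.
function Convert-QuotedToUpper {
    param([string]$Line)
    # The script block is invoked once per match and returns its replacement.
    [regex]::Replace($Line, '"[^"]*"', { param($m) $m.Value.ToUpper() })
}

# Hypothetical wrapper: rewrites every .txt file below $Root in place.
function Update-QuotedText {
    param([string]$Root = '.')

    Get-ChildItem -Path $Root -Filter *.txt -Recurse -File | ForEach-Object {
        $path = $_.FullName
        # The parentheses read the whole file before it is written back.
        (Get-Content -LiteralPath $path) |
            ForEach-Object { Convert-QuotedToUpper -Line $_ } |  # emitted, not assigned to $_
            Set-Content -LiteralPath $path
    }
}

# Quick check on the example line from the question.
Convert-QuotedToUpper -Line '"e" //&&''i&&'
```

Update-QuotedText is only defined here, not called; it would be run as `Update-QuotedText -Root <folder>` once the single-line result looks right.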

powershell: Append text before specific line instead of after

I'm looking for a way to add text before a line.
To be more specific, Before a line and a blank space.
Right now the scripts adds my text after the line [companyusers].
But I'd like to add the line before [CompanytoEXT] and before the blank space above [CompanytoEXT].
Does any body know how to do this?
Visual representation of what I'd want to do: https://imgur.com/a/lgH5i
My current script:
$FileName = "C:\temptest\testimport - Copy.txt"
$Pattern = "[[\]]Companyusers"
$FileOriginal = Get-Content $FileName
[String[]] $FileModified = @()
Foreach ($Line in $FileOriginal)
{
$FileModified += $Line
if ($Line -match $pattern)
{
#Add Lines after the selected pattern
$FileModified += "NEWEMAILADDRESS"
}
}
Set-Content $fileName $FileModified
Thanks for any advice!
Even if you're just pointing me where to look for answers it will be very much appreciated.
This might be easier using an ArrayList, that way you can insert new data easily at a specific point:
$FileName = "C:\temptest\testimport - Copy.txt"
$Pattern = "[[\]]Companyusers"
[System.Collections.ArrayList]$file = Get-Content $FileName
$insert = @()
for ($i=0; $i -lt $file.count; $i++) {
if ($file[$i] -match $pattern) {
$insert += $i-1 #Record the position of the line before this one
}
}
#Now loop the recorded array positions and insert the new text
$insert | Sort-Object -Descending | ForEach-Object { $file.insert($_,"NEWEMAILADDRESS") }
Set-Content $FileName $file
First open the file into an ArrayList, then loop over it. Each time you encounter the pattern, you can add the previous position into a separate array, $insert. Once the loop is done, you can then loop the positions in the $insert array and use them to add the text into the ArrayList.
You need a little state machine here. Note when you have found the correct section, but do not insert the line yet. Insert only at the next empty line (or the end of the file, if the section is the last in the file).
Haven't tested, but should look like this:
$FileName = "C:\temptest\testimport - Copy.txt"
$Pattern = "[[\]]Companyusers"
$FileOriginal = Get-Content $FileName
[String[]] $FileModified = @()
$inCompanyUsersSection = $false
Foreach ($Line in $FileOriginal)
{
if ($Line -match $pattern)
{
$inCompanyUsersSection = $true
}
if ($inCompanyUsersSection -and $line.Trim() -eq "")
{
$FileModified += "NEWEMAILADDRESS"
$inCompanyUsersSection = $false
}
$FileModified += $Line
}
# Border case: CompanyUsers might be the last section in the file
if ($inCompanyUsersSection)
{
$FileModified += "NEWEMAILADDRESS"
}
Set-Content $fileName $FileModified
Edit: If you don't want to use the "insert at the next empty line" approach, because your section may include empty lines, you can also trigger the insert at the beginning of the next section ($line.StartsWith("[")). However, that would complicate things, because now you have to look two lines ahead, which means you have to buffer one line before writing it out. Doable but ugly.
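The "next section" variant may be less awkward if the output is collected in a list first, because the insert position can then be found by stepping back over trailing blank lines instead of buffering. A sketch under that assumption follows. Add-LineToSection and its parameter names are hypothetical, and section headers are assumed to start with "[":

```powershell
# Hypothetical helper: inserts $NewLine at the end of the section whose
# header matches $SectionPattern, before any blank lines that follow it.
function Add-LineToSection {
    param([string[]]$Lines, [string]$SectionPattern, [string]$NewLine)

    $out = [System.Collections.Generic.List[string]]::new()
    $inSection = $false

    foreach ($line in $Lines) {
        if ($inSection -and $line.StartsWith('[')) {
            # Next section begins: step back over trailing blank lines.
            $pos = $out.Count
            while ($pos -gt 0 -and $out[$pos - 1].Trim() -eq '') { $pos-- }
            $out.Insert($pos, $NewLine)
            $inSection = $false
        }
        if ($line -match $SectionPattern) { $inSection = $true }
        $out.Add($line)
    }

    if ($inSection) {
        # Border case: the section is the last one in the file.
        $pos = $out.Count
        while ($pos -gt 0 -and $out[$pos - 1].Trim() -eq '') { $pos-- }
        $out.Insert($pos, $NewLine)
    }
    $out
}

$lines = '[Companyusers]', 'user1', '', '[CompanytoEXT]', 'ext1'
Add-LineToSection -Lines $lines -SectionPattern '^\[Companyusers\]' -NewLine 'NEWEMAILADDRESS'
```

Blank lines inside the section are left alone, since the insert is only triggered by the next header or the end of the input.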