How to stream text with powershell and regex match on multiline - regex

I have a text file that an application constantly errors to. I want to monitor this file with Powershell and log every error to another source.
Problem to solve: how do i pass multiline text when we are in -wait? Get-Content is passing arrays of strings.
$File = 'C:\Windows\Temp\test.txt'
$content = Get-Content -Path $file
# get stream of text
Get-Content $file -wait -Tail 0 | ForEach-Object {
if ($_ -match '(<ACVS_T>)((.|\n)*)(<\/ACVS_T>)+'){
write-host 'match found!'
}
}
Example of text junks that get drop:
<ACVS_T>
<ACVS_D>03/01/2017 17:24:03.602</ACVS_D>
<ACVS_TI>bf37ba1c9,iSTAR Server Compone</ACVS_TI>
<ACVS_C>ClusterPort</ACVS_C>
<ACVS_S>SoftwareHouse.NextGen.HardwareInterface.Nantucket.Framework.ClusterPort.HandleErrorState( )
</ACVS_S>
<ACVS_M>
ERROR MESSAGE FROM APP
</ACVS_M>
<ACVS_ST>
</ACVS_ST>
</ACVS_T>

solved it!
$File = 'D:\Program Files (x86)\Tyco\CrossFire\Logging\SystemTrace.Log'
$content = Get-Content -Path $file
# get stream of text
$text = ''
Get-Content $file -wait -Tail 0 | ForEach-Object {
$text +=$_
if ($text -match '(<ACVS_T>)((.|\n)*)(<\/ACVS_T>)+'){
[xml]$XML = "<Root>" + $text + "</Root>"
$text='' #clear it for next one
$XML.Root.ACVS_T | ForEach-Object {
$Obj = '' | Select-Object -Property ACVS_D, ACVS_TI, ACVS_C, ACVS_S, ACVS_M, ACVS_ST
$Obj.ACVS_D = $_.ACVS_D
$Obj.ACVS_ST = $_.ACVS_ST
$Obj.ACVS_C = $_.ACVS_C
$Obj.ACVS_S = $_.ACVS_S
$Obj.ACVS_M = $_.ACVS_M
$Obj.ACVS_ST = $_.ACVS_ST
write-host "`n`n$($Obj.ACVS_M)"
}
}
}

Related

Powershell: Replace only fist occurence of a line/string in entire file

I have following beggining of a Powershell script in which I would like to replace the values of variables for different enviroment.
$SomeVar1 = "C:\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewriten
$SomeVar2 = "C:\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewriten
When I run the rewrite script, the result should look like this:
$SomeVar1 = "F:\different\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewrite
$SomeVar2 = "F:\different\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewriten
Current script that does(n't) rewrite:
$arr = #(
[PSCustomObject]#{Regex = '$SomeVar1 = "'; Replace = '$SomeVar1 = "F:\different\path\to\file\a"'}
[PSCustomObject]#{Regex = '$SomeVar2 = "'; Replace = '$SomeVar1 = "F:\different\path\to\file\b"'}
)
for ($i = 0; $i -lt $arr.Length; $i++) {
$ArrRegex = [Regex]::Escape($arr[$i].Regex)
$ArrReplace = $arr[$i].Replace
# Get full line for replacement
$Line = Get-Content $Workfile | Select-String $ArrRegex | Select-Object -First 1 -ExpandProperty Line
# Rewrite part
$Line = [Regex]::Escape($Line)
$Content = Get-Content $Workfile
$Content -replace "^$Line",$ArrReplace | Set-Content $Workfile
}
This replaces all the occurences in file on the start of the line (and I need only the 1st one) and doest not replace the one in Note which is okay.
Then I found this Powershell: Replace last occurence of a line in a file which does the exact oposite of what I need, only rewrites the last occurence of the string and it does it in the Note aswell and I would somehow like to change it to do the opposite - 1st occurence, line begining (Wont target the Note)
Code in my case looks like this:
# Rewrite part
$Line = [Regex]::Escape($Line)
$Content = Get-Content $Workfile -Raw
$Line = "(?s)(.*)$Line"
$ArrReplace = "`$1$ArrReplace"
$Content -replace $Line,$ArrReplace | Set-Content $Workfile
Do you have any recommendations on how to archive my goal, or is there a more sothisticated way to replace variables for powershell scripts like this?
Thanks in advance.
So I finally figured it out, I had to add Select-String "^$ArrRegex" during $Line creation to exclude any string that were on on line beggining and then use this Regex to do the job: ^(?s)(.*?\n)$Line
In my case it does the following: Only selects 1st occurnece on the beggining of the line and replaces it. It ignores everything else and when re-run, does not rewrite others. The copies of vars will not really exist in final version and will be set once like $Var1 = "Value" and never changed during script, but I wanted to be sure that I won't replace something by mistake.
The final replacing part does look like this:
for ($i = 0; $i -lt $arr.Length; $i++) {
$ArrRegex = [Regex]::Escape($arr[$i].Regex)
$ArrReplace = $arr[$i].Replace
$Line = Get-Content $Workfile | Select-String "^$ArrRegex" | Select-Object -First 1 -ExpandProperty Line
$Line = [Regex]::Escape($Line)
$Line = "^(?s)(.*?\n)$Line"
$ArrReplace = "`$1$ArrReplace"
$Content -replace $Line, $ArrReplace | Set-Content $Workfile
}
You could possibly use flag variables like below to only do the first replacement for each of your regex patterns.
$Altered = Get-Content -Path $Workfile |
Foreach-Object {
if(-not $a) { #If replacement hasn't been done, replace
$_ = $_ -replace 'YOUR_REGEX1','YOUR_REPLACEMENT1'
if($_ -match 'YOUR_REPLACEMENT1') { $a = 'replacement done' } #Set Flag
}
if(-not $b) { #If replacement hasn't been done, replace
$_ = $_ -replace 'YOUR_REGEX2','YOUR_REPLACEMENT2'
if($_ -match 'YOUR_REPLACEMENT2') { $b = 'replacement done' } #Set Flag
}
$_ # Pipe back to $Altered
}
$Altered | Set-Content -Path $WorkFile
Just reverse the RegEx, if that is what you are after:
Clear-Host
#'
abc
abc
abc
'# -replace '^(.*?)\babc\b', '$1HelloWorld'
# Results
<#
HelloWorld
abc
abc
#>

PowerShell: Pattern Matching Text File Contents to Insert into .CSV

I have been struggling to successfully break apart contents of a text file and insert them into a .csv with the following rules:
The line containing '>' should be inserted into .csv column 1
The lines containing all caps should be inserted into .csv column 2 and each block of capital letters should be joined (have its `r or `n removed)
'>' and '*' should be removed where present
Separately, I can get column 1 to work fairly well using:
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
if ($line -match '^>') {
[pscustomobject]#{
'Part1' = (Select-String '^>' -InputObject $line) -replace '>', ''
}
}
}
$data | Out-File 'newfile.csv'
and limited success using similar for column 2 (I can't seem to get -join to work with `r or `n):
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
if ($line -match '^[A-Z].*') {
[pscustomobject]#{
'Part2' = (Select-String '^[A-Z].*' -InputObject $line) -replace '*', ''
}
}
}
$data | Out-File 'newfile.csv'
But it escapes me how to get both to work in the same code block to iterate over each section delimited by '>' and/or '*'.
Below is a sample of the data for reference.
>9392290|2983921
FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIA
GDFOUYUIOAGHIHUAGSD
>lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1
FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAW
DFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNUPAPNUNPUFSAPNUSS
FSADUHHULGWAUNUNWEANNIOEAWNUNIIIINNBSDNJLKNJKLAERGJKLHHJLKGS
DFSAQSAHUSDFAHOUHGROUGRWE*
>jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31
ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPI
GWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIOAGPIOJSGNJHIOWEA
AUHNHIOEANPIASPNIOICBNIOASGIOEGWPIOWEPPPPSAJPOJKGPWEAIOJJPIO
FAWEIOPHGAHNIOPGWEOPPOEAWSPIOOPUIGSUIOGUIOPWAGIEOUIWEAOGUIOP
GEIOJHIOJPWEPJIOWGEIOPHGANIONIOGEWANIOEGWOPIHNNPIOEGWIJOWEAG
GEPUIEWUIOSZBHJENWNBENUEBMIPEWVMIEMUIAZWIPNBWEPEWIOJJKEAWPIA
GWEPHIOEWNPOEWANNNPIOGWREIJUOGUHIOSNJJJJJJJJKVMVIOIPEGIOEAUW
EGWIOJNENIOPIOWINPEAWNPOI*
I suggest using a -split operation:
(Get-Content -Raw samplefile.txt) -split '(?m)^>(.+)' -ne '' |
ForEach-Object -Begin { $i = 0 } -Process {
if (++$i % 2) { # 1st, 3rd, ... result, i.e. the ">"-prefixed lines
$part1 = $_ # Save for later.
} else { # 2nd, 4th, ... result, i.e. the all-uppercase lines
[pscustomobject] #{ # Construct and output a custom object.
Part1 = $part1
Part2 = $_ -replace '\r?\n|\*$' # Remove newlines and trailing "*"
}
}
} # pipe to Export-Csv as needed.
To-display output:
Part1 Part2
----- -----
9392290|2983921 FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIAGDFOUYUIOAGHIHUAGSD
lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1 FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAWDFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNU…
jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31 ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPIGWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIO…

Powershell Regex Function Deleting CRLFs

I am using the below function to create a JSON file from a SQL file. Unfortunately it is deleting the CRLF at the end of each line of the SQL file. I want it to keep them instead.
function GetStringBetweenTwoStrings($firstString, $secondString, $importPath){
>>
>> #Get content from file
>> $file = Get-Content $importPath
>>
>> #Regex pattern to compare two strings
>> $pattern = "$firstString(.*?)$secondString"
>>
>> #Perform the opperation
>> $result = [regex]::Match($file,$pattern).Groups[1].Value
>>
>> #Return result
>> return "{""sql"":"""+$result+"""}"
>>
>> }
I have tried using -raw but it does not seem to work
Thanks,
John
Interesting question
Unfortunately, I couldn't figure out a way to keep CRLF characters from `[regex]::Match` command.
It captures them fine but seems to return them as a single string by default.
If someone can figure that out, I'd be glad to see it.
Thanks to people much smarter than me, the following way with [regex]::match seems to work
function Get-StringBetweenTwoStrings {
[cmdletBinding()]
param (
$firstString,
$secondString,
$fullString
)
# Get content from file WITH -RAW
$file = Get-Content -Path $fullString -Raw
Write-Verbose $file -Verbose
# Regex pattern to compare two strings
$pattern = '{0}(.*?){1}' -f $firstString, $secondString
Write-Verbose $pattern -Verbose
# Perform the operation
$result = [regex]::Match($file, $pattern, 'SingleLine, MultiLine, IgnoreCase').Value
# Result
"{""sql"":""$result""}"
}
Test the code
Get-StringBetweenTwoStrings -firstString '(?<=GO)' -secondString '(?=GO)' -fullString .\Downloads\test.txt
Image
Workaround
When all else fails, I go back to brute force.
Start capturing when we see our $firstString, and keep capturing until we find our $secondString or reach the end.
Sample Data
$s = #'
# This is a random comment
GOSELECT TOP (1)
*
FROM dbo.Users
WHERE CaffeineLevel = 'Low';
# Can we get a cafGOfeine drip?
GO
# Why isn't this easier
'# -split '\r?\n'
Code
$capture = [Text.StringBuilder]::new()
$capturing = $false
$firstString = 'GO'
$secondString = 'GO'
foreach ($line in $s) {
if ($line -match $secondString -and $capturing) {
Write-Verbose "Stopping...$line" -Verbose
<#
In case we want to capture a partial line
look for everything UNTIL our second string
#>
$splitLine = ($line | Select-String -Pattern ".*(?=$secondString)").Matches.Value
Write-Verbose "Capturing: [$splitLine]" -Verbose
$null = $capture.AppendLine($splitLine)
$capturing = $false
<# second string found, stop altogether #>
break
}
if ($capturing) {
Write-Verbose "Capturing: [$line]" -Verbose
$null = $capture.AppendLine($line)
}
if ($line -match $firstString) {
Write-Verbose "Starting...$line" -Verbose
<#
In case we want to capture a partial line,
look for everything AFTER our first string
#>
$splitLine = ($line | Select-String -Pattern "(?<=$firstString).*").Matches.Value
Write-Verbose "Capturing: [$splitLine]" -Verbose
$null = $capture.AppendLine($splitLine)
$capturing = $true
}
}
$capture.ToString()
Dirty Testing Results

Regular expression in power shell to convert character between double quotes to upper case

I want to write a powershell script which will convert a string which is present between double quotes in a file, and convert it into upper case.
The files are placed in different folders.
I am able to extract the string between the double quotes and convert it to upper case, but not able to replace it in the correct position.
Ex : This is the input string.
"e" //&&'i&&
The output should be
"E" //&&'i&&
This is what i have tried. Also this even i not replacing the content of the file.
$items = Get-ChildItem * -recurse
# enumerate the items array
foreach ($item in $items)
{
# if the item is a directory, then process it.
if ($item.Attributes -ne "Directory")
{
(Get-Content $item.FullName ) |
Foreach-Object {
if (($_ -match '\"'))
{
$str = $_
$ext = [regex]::Matches($str, '".*?"').Value -replace '"'
$ext = $ext.ToUpper()
Write-Host $ext
$_ = $ext
}
else { }
} |
Set-Content $item.FullName
}
}
This can do it. Really I wasn't following your code so I stripped it and modified the regex.
$items = Get-ChildItem "C:\Users\UsernameHere\Desktop\Folder123\*.txt"
# enumerate the items array
foreach ($item in $items){
# if the item is a directory, then process it.
if ($item.Attributes -ne "Directory"){
$content = (gc $item.FullName )
$content = $content.replace('"\w.*"',$matches[0].ToUpper)
$content | sc $item
}
}
If you had powershell 6 or 7:
'"hi"' -replace '".*"', { $_.value.toupper() }
"HI"
'"e" //&&''i&&' -replace '".*"', { $_.value.toupper() }
"E" //&&'i&&
I am able to print the upper case characters with the below code, but the file is not getting updated. It still has the old characters, How to update the fie with new contents.
$items = Get-ChildItem *.txt -recurse
# enumerate the items array
foreach ($item in $items)
{
# if the item is a directory, then process it.
if ($item.Attributes -ne "Directory")
{
(Get-Content $item.FullName ) |
Foreach-Object {
$str = $_
$_ = [regex]::Replace($_, '"[^"]*"', { param($m) $m.Value.ToUpper() })
Write-Host $_
} |
Set-Content $item.FullName
}
}

Powershell search matching string in word document

I have a simple requirement. I need to search a string in Word document and as result I need to get matching line / some words around in document.
So far, I could successfully search a string in folder containing Word documents but it returns True / False based on whether it could find search string or not.
#ERROR REPORTING ALL
Set-StrictMode -Version latest
$path = "c:\MORLAB"
$files = Get-Childitem $path -Include *.docx,*.doc -Recurse | Where-Object { !($_.psiscontainer) }
$output = "c:\wordfiletry.txt"
$application = New-Object -comobject word.application
$application.visible = $False
$findtext = "CRHPCD01"
Function getStringMatch
{
# Loop through all *.doc files in the $path directory
Foreach ($file In $files)
{
$document = $application.documents.open($file.FullName,$false,$true)
$range = $document.content
$wordFound = $range.find.execute($findText)
if($wordFound)
{
"$file.fullname has $wordfound" | Out-File $output -Append
}
}
$document.close()
$application.quit()
}
getStringMatch
#ERROR REPORTING ALL
Set-StrictMode -Version latest
$path = "c:\Temp"
$files = Get-Childitem $path -Include *.docx,*.doc -Recurse | Where-Object { !($_.psiscontainer) }
$output = "c:\temp\wordfiletry.csv"
$application = New-Object -comobject word.application
$application.visible = $False
$findtext = "First"
$charactersAround = 30
$results = #{}
Function getStringMatch
{
# Loop through all *.doc files in the $path directory
Foreach ($file In $files)
{
$document = $application.documents.open($file.FullName,$false,$true)
$range = $document.content
If($range.Text -match ".{$($charactersAround)}$($findtext).{$($charactersAround)}"){
$properties = #{
File = $file.FullName
Match = $findtext
TextAround = $Matches[0]
}
$results += New-Object -TypeName PsCustomObject -Property $properties
}
}
If($results){
$results | Export-Csv $output -NoTypeInformation
}
$document.close()
$application.quit()
}
getStringMatch
import-csv $output
There are a couple of ways to get what you want. A simple approach is since you have the text of the document already lets perform a regex match on it and return the results and more. This helps in trying to address getting some words around in document.
We have the variable $charactersAround which sets the number of characters to match around the $findtext. Also I though the output was a better fit for a CSV file so I used $results to capture a hashtable of properties that, in the end, are output to a csv file.
Be sure to change the variables for your own testing. Now that we are using regex to locate the matches this opens up a world of possibilities.
Sample Output
Match TextAround File
----- ---------- ----
First dley Air Services Limited dba First Air meets or exceeds all term C:\Temp\20120315132117214.docx
Thanks! You provided a great solution to use PowerShell regex expressions to look for information in a Word document. I needed to modify it to meet my needs. Maybe, it will help someone else. It reads each line of the word document, and then uses the regex expression to determine if the line is a match. The output could easily be modified or dumped to a log file.
Set-StrictMode -Version latest
$path = "c:\Temp\pii"
$files = Get-Childitem $path -Include *.docx,*.doc -Recurse | Where-Object { !($_.psiscontainer) }
$application = New-Object -comobject word.application
$application.visible = $False
$findtext = "[0-9]" #regex
Function getStringMatch
{
# Loop through all *.doc files in the $path directory
Foreach ($file In $files) {
$document = $application.documents.open($file.FullName,$false,$true)
$arrContents = $document.content.text.split()
$varCounter = 0
ForEach ($line in $arrContents) {
$varCounter++
If($line -match $findtext) {
"File: $file Found: $line Line: $varCounter"
}
}
$document.close()
}
$application.quit()
}
getStringMatch
Good answer from #Matt.
I improved it a little (new PowerShell version have problems with the given array. And to search big amount of documents it runs out of memory.
Here is my improved version:
#ERROR REPORTING ALL
Set-StrictMode -Version latest
$path = "c:\Temp"
$files = Get-Childitem $path -Include *.docx,*.doc -Recurse | Where-Object { !($_.psiscontainer) }
$output = "c:\temp\wordfiletry.csv"
$application = New-Object -comobject word.application
$application.visible = $False
$findtext = "First"
$charactersAround = 30
$results = #{}
Function getStringMatch
{
# Loop through all *.doc files in the $path directory
Foreach ($file In $files)
{
$document = $application.documents.open($file.FullName,$false,$true)
$range = $document.content
If($range.Text -match ".{$($charactersAround)}$($findtext).{$($charactersAround)}"){
$properties = #{
File = $file.FullName
Match = $findtext
TextAround = $Matches[0]
}
$results += #(New-Object -TypeName PsCustomObject -Property $properties)
}
$document.close()
}
If($results){
$results | Export-Csv $output -NoTypeInformation
}
$application.quit()
}
getStringMatch
import-csv $output
Use the function like this:
PS> WordGrep -File ./Myfile.docx -Grep one, two, three
function WordGrep{
param(
[string]$File,
[string[]]$Grep,
[switch]$WordMode,
[switch]$EscapeMode
)
$WordApp = New-Object -comobject word.application
$WordApp.visible = $False
try {
$document = $WordApp.documents.open($File, $false, $true)
$arrContents = $document.content.text.split()
$found = $false
foreach ($line in $arrContents) {
foreach ($pattern in $Grep) {
if ($EscapeMode) {
$pattern = [Regex]::Escape($pattern)
}
if ($WordMode) {
$pattern = "\b${pattern}\b"
}
if ($line -imatch $pattern) {
write-host -ForegroundColor Cyan -NoNewLine "$file`:"
write-host " $line"
break;
}
}
}
$document.close()
}
finally {
$WordApp.quit()
}
}