Get the numbers after ":" and count them with the help of powershell - regex

Could someone please help me with extracting and counting the numbers from a text file with PowerShell?
Example: C:\temp\1.txt contains some text with colons, each followed by a number. I need to sum all of these numbers.
blablabl:5 dzfdsfdsfsdfsf:10
sdfsdfsdfdffs:8sdfsfsfdsfdsf:111
5+10+8+111...
What I've tried so far:
$LogText = "C:\temp\1.txt"
[regex]$Regex = "\. (\d+):[1]"
$Matches = $Regex.Matches($LogText)
$Matches | ForEach-Object {
Write-Host $Matches
}
#$array = @()
#$array = new-object collections.arraylist
$array = while ($Matches.Success) {
Write-Host $array[i++]
}
# -------------------------------------------------------------------
$text = Get-Content "C:\temp\1.txt"
[regex]$Regex = "\d"
$Matches = $Regex.Matches($text)
# -------------------------------------------------------------------
$pos = $text.IndexOf(":")
$rightPart = $text.Substring($pos+1)
Write-Host $rightPart

Use Select-String to extract the matches from the file and Measure-Object to do the calculation.
Select-String -Path 'C:\temp\1.txt' -Pattern '(?<=:)\d+' -AllMatches |
Select-Object -Expand Matches |
Select-Object -Expand Value |
Measure-Object -Sum |
Select-Object -Expand Sum
(?<=:) is a positive lookbehind assertion to match the colon preceding the number without making it part of the match.
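As a minimal sketch of what the lookbehind buys you, the same pattern can be applied to an in-memory string with [regex]::Matches. The first string is the sample from the question; the second one ('item2:5 other:10') is a made-up input, assumed here only to show that a bare \d+ would also pick up digits that do not follow a colon.

```powershell
# Sample text from the question, held in memory instead of read from C:\temp\1.txt
$text = "blablabl:5 dzfdsfdsfsdfsf:10`nsdfsdfsdfdffs:8sdfsfsfdsfdsf:111"

# Sum only the numbers that directly follow a colon
$sum = ([regex]::Matches($text, '(?<=:)\d+') |
    ForEach-Object { [int]$_.Value } |
    Measure-Object -Sum).Sum
$sum    # 5 + 10 + 8 + 111 = 134

# Hypothetical input containing a digit that is NOT preceded by a colon
$tricky = 'item2:5 other:10'
$withLookbehind = [regex]::Matches($tricky, '(?<=:)\d+') | ForEach-Object { $_.Value }
$plainDigits    = [regex]::Matches($tricky, '\d+')       | ForEach-Object { $_.Value }
$withLookbehind -join ','   # 5,10
$plainDigits -join ','      # 2,5,10
```

If every number in the file follows a colon, both patterns give the same sum; the lookbehind only matters once other digits appear in the text.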

Try it like that:
$txt=
@"
blablabl:5 dzfdsfdsfsdfsf:10
sdfsdfsdfdffs:8sdfsfsfdsfdsf:111
"@
[regex]$Regex = '\d+'
$sum=0;
$Regex.Matches($txt) | ForEach-Object {
$val = [int]$_.Value
$val
$sum+=$val
}
$sum

Related

Powershell: Replace only first occurrence of a line/string in entire file

I have the following beginning of a PowerShell script, in which I would like to replace the values of variables for a different environment.
$SomeVar1 = "C:\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewriten
$SomeVar2 = "C:\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewriten
When I run the rewrite script, the result should look like this:
$SomeVar1 = "F:\different\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewrite
$SomeVar2 = "F:\different\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewriten
Current script that does(n't) rewrite:
$arr = @(
[PSCustomObject]@{Regex = '$SomeVar1 = "'; Replace = '$SomeVar1 = "F:\different\path\to\file\a"'}
[PSCustomObject]@{Regex = '$SomeVar2 = "'; Replace = '$SomeVar2 = "F:\different\path\to\file\b"'}
)
for ($i = 0; $i -lt $arr.Length; $i++) {
$ArrRegex = [Regex]::Escape($arr[$i].Regex)
$ArrReplace = $arr[$i].Replace
# Get full line for replacement
$Line = Get-Content $Workfile | Select-String $ArrRegex | Select-Object -First 1 -ExpandProperty Line
# Rewrite part
$Line = [Regex]::Escape($Line)
$Content = Get-Content $Workfile
$Content -replace "^$Line",$ArrReplace | Set-Content $Workfile
}
This replaces all the occurrences at the start of a line (and I need only the first one) and does not replace the one in the Note, which is okay.
Then I found "Powershell: Replace last occurence of a line in a file", which does the exact opposite of what I need: it only rewrites the last occurrence of the string, and it does so in the Note as well. I would like to change it to do the opposite: replace only the first occurrence at the beginning of a line (which won't target the Note).
Code in my case looks like this:
# Rewrite part
$Line = [Regex]::Escape($Line)
$Content = Get-Content $Workfile -Raw
$Line = "(?s)(.*)$Line"
$ArrReplace = "`$1$ArrReplace"
$Content -replace $Line,$ArrReplace | Set-Content $Workfile
Do you have any recommendations on how to achieve my goal, or is there a more sophisticated way to replace variables in PowerShell scripts like this?
Thanks in advance.
So I finally figured it out. I had to add Select-String "^$ArrRegex" when creating $Line, to exclude any matches that are not at the beginning of a line, and then use this regex to do the job: ^(?s)(.*?\n)$Line
In my case it does the following: it selects only the first occurrence at the beginning of a line and replaces it. It ignores everything else and, when re-run, does not rewrite the others. The copies of the variables will not really exist in the final version; each will be set once, like $Var1 = "Value", and never changed during the script, but I wanted to be sure that I won't replace something by mistake.
The final replacing part does look like this:
for ($i = 0; $i -lt $arr.Length; $i++) {
$ArrRegex = [Regex]::Escape($arr[$i].Regex)
$ArrReplace = $arr[$i].Replace
$Line = Get-Content $Workfile | Select-String "^$ArrRegex" | Select-Object -First 1 -ExpandProperty Line
$Line = [Regex]::Escape($Line)
$Line = "^(?s)(.*?\n)$Line"
$ArrReplace = "`$1$ArrReplace"
$Content = Get-Content $Workfile -Raw # read the whole file as a single string
$Content -replace $Line, $ArrReplace | Set-Content $Workfile
}
You could possibly use flag variables like below to only do the first replacement for each of your regex patterns.
$Altered = Get-Content -Path $Workfile |
Foreach-Object {
if(-not $a) { #If replacement hasn't been done, replace
$_ = $_ -replace 'YOUR_REGEX1','YOUR_REPLACEMENT1'
if($_ -match 'YOUR_REPLACEMENT1') { $a = 'replacement done' } #Set Flag
}
if(-not $b) { #If replacement hasn't been done, replace
$_ = $_ -replace 'YOUR_REGEX2','YOUR_REPLACEMENT2'
if($_ -match 'YOUR_REPLACEMENT2') { $b = 'replacement done' } #Set Flag
}
$_ # Pipe back to $Altered
}
$Altered | Set-Content -Path $WorkFile
Just reverse the RegEx, if that is what you are after:
Clear-Host
@'
abc
abc
abc
'@ -replace '^(.*?)\babc\b', '$1HelloWorld'
# Results
<#
HelloWorld
abc
abc
#>
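Another way to limit a replacement to the first hit is the count argument of the .NET Regex.Replace(input, replacement, count) instance method. The following is only a sketch under assumptions: the $map table and its keys are hypothetical stand-ins for the $arr array above, and the file content is kept in a here-string instead of being read from $Workfile.

```powershell
# Sample file content from the question, kept in memory for the demo
$content = @'
$SomeVar1 = "C:\path\to\file\a"
$SomeVar1 = "C:\path\to\file\a" # Copy for test - Should not be rewriten
$SomeVar2 = "C:\path\to\file\b"
# Note $SomeVar1 = "C:\path\to\file\a" - Should not be rewriten
'@

# Hypothetical lookup table: variable name -> full replacement line
$map = [ordered]@{
    'SomeVar1' = '$SomeVar1 = "F:\different\path\to\file\a"'
    'SomeVar2' = '$SomeVar2 = "F:\different\path\to\file\b"'
}

foreach ($name in $map.Keys) {
    # (?m) lets ^ match at every line start; [^\r\n]* consumes the rest of that line
    $regex = [regex]('(?m)^\$' + [regex]::Escape($name) + ' = "[^\r\n]*')
    # Double every $ so the replacement text is inserted literally
    $replacement = $map[$name].Replace('$', '$$')
    # The third argument (1) stops after the first match
    $content = $regex.Replace($content, $replacement, 1)
}
$content
```

Because the pattern is anchored with ^, the commented Note line is never a candidate, and the count of 1 leaves the second $SomeVar1 line untouched. Writing the result back with Set-Content is left out of the sketch.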

Keep the $character in regular expression replace

I have two problems with a regular expression replace:
1. I need to keep the leading $variable at the start of each line in the replacement result.
2. Skipping the first two lines and the last line does not work.
Code:
$str = @'
#$start1 Random characters
#$start2 Random characters
$p1.AppendBreak($BreakType.LineBreak)
$doc.Protect($ProtectionType.AllowOnlyRevisions, "123")
$footerPara.AppendField("page", $FieldType.FieldPage)
$footerParagraph.AppendField("number of pages", $FieldType.FieldSectionPages)
$txtWatermark.Layout = $WatermarkLayout.Diagonal
$tr1.CharacterFormat.Border.BorderType = $BorderStyle.DashDotStroker
$stri.CharacterFormat.TextBackgroundColor = $Color.LightGray
$document.LoadFromFile(".\Template_HtmlFile.html", $FileFormat.Html, $XHTMLValidationType.None)
$docObject.DocumentObjectType -eq $DocumentObjectType.Picture
$document.Sections[0].Paragraphs[0].InsertSectionBreak($SectionBreakType.NoBreak)
$footerParagraph.Format.HorizontalAlignment = $Spire.Doc.Documents.HorizontalAlignment.Right
#end Random characters
'@
$str | Foreach-Object {
$_ -replace '\$\w+\.(\w+)', '"$1"'
} | Set-Content .\ok.txt
<# -Skip -SkipLast not valid
$str | Foreach-Object {
$_ -replace '\$\w+\.(\w+)', '"$1"'
} | Select-Object -Skip 2 | Select-Object -SkipLast 1 | Set-Content .\ok.txt
#>
Expected results:
At least for your example here-string, you first need to split it into a string array; otherwise the whole text is one pipeline item and -Skip/-SkipLast have nothing to skip. For the replacement, I was only successful when capturing both the beginning of the line and the text to change.
$str -split '\r?\n' | Select-Object -Skip 2 |
Select-Object -SkipLast 1 | Foreach-Object {
$_ -replace '(^.+?)\$.+\.(\w+)', '$1"$2"'
} | Set-Content .\ok.txt
Contents of ok.txt
$p1.AppendBreak("LineBreak")
$doc.Protect("AllowOnlyRevisions", "123")
$footerPara.AppendField("page", "FieldPage")
$footerParagraph.AppendField("number of pages", "FieldSectionPages")
$txtWatermark.Layout = "Diagonal"
$tr1.CharacterFormat.Border.BorderType = "DashDotStroker"
$stri.CharacterFormat.TextBackgroundColor = "LightGray"
$document.LoadFromFile(".\Template_HtmlFile.html", "None")
$docObject.DocumentObjectType -eq "Picture"
$document.Sections[0].Paragraphs[0].InsertSectionBreak("NoBreak")
$footerParagraph.Format.HorizontalAlignment = "Right"
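A possible alternative, offered as a sketch rather than as the answer's method: a negative lookbehind (?<!^) skips the $variable at the very start of the line, and [\w.]+ lets the pattern run through dotted names such as $Spire.Doc.Documents.HorizontalAlignment.Right. Note that, unlike the output above, this variant converts every $Type.Member on a line, so the LoadFromFile line keeps "Html" as well. The three lines are taken from the question's sample.

```powershell
# Three representative lines from the question's here-string
$lines = @(
    '$p1.AppendBreak($BreakType.LineBreak)'
    '$footerParagraph.Format.HorizontalAlignment = $Spire.Doc.Documents.HorizontalAlignment.Right'
    '$document.LoadFromFile(".\Template_HtmlFile.html", $FileFormat.Html, $XHTMLValidationType.None)'
)

# (?<!^) : do not start a match at the beginning of the string (keeps the leading variable)
# [\w.]+ : the type name, possibly dotted; (\w+) captures the last member name
$converted = $lines | ForEach-Object {
    $_ -replace '(?<!^)\$[\w.]+\.(\w+)', '"$1"'
}
$converted
```

This relies on each line being its own string, so ^ only matches at the start of that line.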

PowerShell Regex with csv file

I'm currently trying to match a pattern of IDs and replace each ID with a shortened value that starts with 0 or 1.
For example, pc0045601234 should become "01234": keep the last 4 digits (1234) and put the 3rd character in front.
I tried the code below, but the output only filled the userid column with "No Matching Employee".
$reportPath = '.\report.csv'
$csvPath = '.\output.csv'
$data = Import-Csv -Path $reportPath
$output = @()
foreach ($row in $data) {
$table = "" | Select ID,FirstName,LastName,userid
$table.ID = $row.ID
$table.FirstName = $row.FirstName
$table.LastName = $row.LastName
switch -Wildcard ($row.ID)
{
{$row.ID -match 'P\d\d\d\d\d\D\D\D'} {$table.userid = "Contractor"; continue}
{$row.ID -match 'SEC\d\d\d\D\D\D\D'} {$table.userid = "Contractor"; continue}
{$row.ID.StartsWith("P005700477")} {$table.userid = $row.ID -replace "P005700477","0477"; continue}
{$row.ID.StartsWith("P00570")} {$table.userid = $row.ID -replace "P00570","0"; continue}
default {$table.userid = "No Matching Employee"}
}
$output += $table
}
$output | Export-csv -NoTypeInformation -Path $csvPath
Here are three different ways to achieve the desired result. The first two use the same technique, just written in a different way.
First we put the sample data in a variable as a string array, one line per element. This is equivalent to $text = Get-Content $somefile
$text = @'
PC05601234
PC15601234
'@ -split [environment]::NewLine
Option 1 # convert to character array, select the 3rd and last 4 digits.
$text | foreach {-join ($_.ToChararray()| select -Skip 2 -First 1 -Last 4)}
Option 2 # same as above, requiring an extra -join to avoid spaces.
$text | foreach {(-join $_.ToChararray()| foreach{$_[2]+(-join $_[-4..-1])})}
Option 3 # my preference, regex. Capture the desired digits and replace the entire string with those two captured values.
$text -replace '^\D+(\d)\w+(\d{4})$','$1$2'
All of these output
01234
11234
Further testing with different char/digit combinations and lengths.
$text = @'
PC05601234
PC15601234
PC0ABC124321
PC1DE4321
PC0A5678
PC1ABCD215678
'@ -split [environment]::NewLine
Running the new sample data through each option produces this output in all three cases
01234
11234
04321
14321
05678
15678
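To tie this back to the CSV in the question, here is a rough sketch of applying the same capture idea per row. The column names come from the question's script; the inline rows and the "Example" names are invented for the demo, and ConvertFrom-Csv stands in for Import-Csv so the snippet does not need report.csv.

```powershell
# Made-up rows using the columns from the question (ID, FirstName, LastName)
$csvText = @'
ID,FirstName,LastName
PC05601234,Ann,Example
PC15601234,Bob,Example
XYZ,Cy,Example
'@

$output = $csvText -split '\r?\n' | ConvertFrom-Csv | ForEach-Object {
    # First digit after the letter prefix, plus the last four digits
    $userid = if ($_.ID -match '^\D+(\d)\w+(\d{4})$') {
        $Matches[1] + $Matches[2]
    } else {
        'No Matching Employee'
    }
    [PSCustomObject]@{
        ID        = $_.ID
        FirstName = $_.FirstName
        LastName  = $_.LastName
        userid    = $userid
    }
}
$output | Format-Table
```

With a real file, the first pipeline stage would be Import-Csv -Path $reportPath and the result could be piped to Export-Csv -NoTypeInformation as in the question.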

How to stream text with powershell and regex match on multiline

I have a text file that an application constantly writes errors to. I want to monitor this file with PowerShell and log every error to another source.
Problem to solve: how do I match multiline text while using -Wait? Get-Content passes the file line by line, as individual strings.
$File = 'C:\Windows\Temp\test.txt'
$content = Get-Content -Path $file
# get stream of text
Get-Content $file -wait -Tail 0 | ForEach-Object {
if ($_ -match '(<ACVS_T>)((.|\n)*)(<\/ACVS_T>)+'){
write-host 'match found!'
}
}
Example of the text chunks that get written to the file:
<ACVS_T>
<ACVS_D>03/01/2017 17:24:03.602</ACVS_D>
<ACVS_TI>bf37ba1c9,iSTAR Server Compone</ACVS_TI>
<ACVS_C>ClusterPort</ACVS_C>
<ACVS_S>SoftwareHouse.NextGen.HardwareInterface.Nantucket.Framework.ClusterPort.HandleErrorState( )
</ACVS_S>
<ACVS_M>
ERROR MESSAGE FROM APP
</ACVS_M>
<ACVS_ST>
</ACVS_ST>
</ACVS_T>
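Solved it! I append each incoming line to a buffer and only parse the buffer once the closing tag has arrived: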
$File = 'D:\Program Files (x86)\Tyco\CrossFire\Logging\SystemTrace.Log'
$content = Get-Content -Path $file
# get stream of text
$text = ''
Get-Content $file -wait -Tail 0 | ForEach-Object {
$text +=$_
if ($text -match '(<ACVS_T>)((.|\n)*)(<\/ACVS_T>)+'){
[xml]$XML = "<Root>" + $text + "</Root>"
$text='' #clear it for next one
$XML.Root.ACVS_T | ForEach-Object {
$Obj = '' | Select-Object -Property ACVS_D, ACVS_TI, ACVS_C, ACVS_S, ACVS_M, ACVS_ST
$Obj.ACVS_D = $_.ACVS_D
$Obj.ACVS_TI = $_.ACVS_TI
$Obj.ACVS_C = $_.ACVS_C
$Obj.ACVS_S = $_.ACVS_S
$Obj.ACVS_M = $_.ACVS_M
$Obj.ACVS_ST = $_.ACVS_ST
write-host "`n`n$($Obj.ACVS_M)"
}
}
}
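The buffering idea can also be wrapped in a small pipeline function, which keeps the regex simple because it only has to spot the opening and closing tags. This is a sketch under assumptions: Group-AcvsRecord is a hypothetical name, the record is assumed to be well-formed XML on its own, and the demo feeds an in-memory array (a shortened copy of the sample record) instead of a live Get-Content -Wait stream.

```powershell
function Group-AcvsRecord {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline)]
        [string]$Line
    )
    begin {
        $buffer = [System.Collections.Generic.List[string]]::new()
        $collecting = $false
    }
    process {
        # Start (or restart) a record when the opening tag shows up
        if ($Line -match '<ACVS_T>') { $buffer.Clear(); $collecting = $true }
        if ($collecting) { $buffer.Add($Line) }
        # Emit one object once the closing tag has arrived
        if ($collecting -and $Line -match '</ACVS_T>') {
            $collecting = $false
            $record = ([xml]($buffer.ToArray() -join "`n")).ACVS_T
            [PSCustomObject]@{
                ACVS_D = $record.ACVS_D
                ACVS_C = $record.ACVS_C
                ACVS_M = $record.ACVS_M.Trim()
            }
        }
    }
}

# Shortened sample record, one array element per line as Get-Content would deliver it
$sampleLines = @(
    '<ACVS_T>'
    '<ACVS_D>03/01/2017 17:24:03.602</ACVS_D>'
    '<ACVS_C>ClusterPort</ACVS_C>'
    '<ACVS_M>'
    'ERROR MESSAGE FROM APP'
    '</ACVS_M>'
    '</ACVS_T>'
)
$records = @($sampleLines | Group-AcvsRecord)
$records
```

For live monitoring the same function would sit behind the stream, for example Get-Content $File -Wait -Tail 0 | Group-AcvsRecord. Lines that arrive before the first opening tag are ignored, which helps when tailing starts in the middle of a record.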

Powershell parsing xml logfile & get currently parsed filename

I'm new with powershell and in need of guidance. Been scouring the site for answers and coming up blank, decided to ask instead. If this has been answered please refer me to the link.
I have an application log (xml format) like below:
<log><identifier>123axr4x5</identifier><login>USER1</login><source>Order-Management</source><AddlInfo>Execution Time : 20ms</AddlInfo><Exception></Exception><timestamp>01/01/2015:22:00:00</timestamp><serverticks>643670855</serverticks><PID>1234</PID><Machine>PRD01X12mm</Machine></log>
<log><identifier>dd8jksl3g</identifier><login>USER2</login><source>Service-Assurance</source><AddlInfo>Execution Time : 80ms</AddlInfo><Exception></Exception><timestamp>01/01/2015:22:00:00</timestamp><serverticks>643680865</serverticks><PID>1234</PID><Machine>PRD01X12mm</Machine></log>
: and so on
I am creating a log parser that scans a folder and its subfolders for lines matching a regex pattern and, based on a certain threshold, outputs them to a grid view or exports them to CSV. I am almost done; however, I'm unable to solve one problem, which is to get the name of the file currently being parsed so it can be displayed in the grid view.
Basically I am using a piped Get-ChildItem as below
Get-ChildItem $Dir -recurse -Filter *logging*.txt|
Sort-Object LastWriteTime |
?{$_.LastWriteTime -gt (Get-Date).AddMinutes(-60)}|
Select-String -Pattern $Text |
Select-String -Pattern $Text3 |
Select-String -Pattern $Text2 -allmatches |
Foreach-Object {
$information = $_|Select-Object -Property API, Duration,DataRetrieved, ServerTime, ServerTicks , Identifier, Filename
$information.Filename = $_.Name
#$information.Filename = $_.FullName
} |
Out-GridView
Below is the full code:
$Dir = "C:\log\"
$threshold = 1 + 0
$StartTime = (Get-Date).ToString();
$EndTime = (Get-Date).ToString();
$Text = "abc"
$Text2 = "def"
$Text3 = "ghi"
$OutFile = "result"
$OutPath = $Dir + $OutFile + ".txt"
#ExtractionParameters
$AddlInnfoTagBegin = "AddlInfo"
$AddlInnfoTagEnd = "/AddlInfo"
$ServerTimeOfLogTagBegin = "ServerTimeOfLog"
$ServerTimeOfLogTagEnd = "/ServerTimeOfLog"
$ServerTicksTagBegin = "ServerTicks"
$ServerTicksTagEnd = "/ServerTicks"
$IdentifierTagBegin = "Identifier"
$IdentifierTagEnd = "/Identifier"
#parse file in folders
Get-ChildItem $Dir -recurse -Filter *logging*.txt|
Sort-Object LastWriteTime |
#?{$_.LastWriteTime -gt (Get-Date).AddMinutes(-60)}|
Select-String -Pattern $Text |
Select-String -Pattern $Text3 |
Select-String -Pattern $Text2 -allmatches |
Foreach-Object {
# take the text of the matched line
$parts = $_.Line
#write $parts
$indexOfAddlInfoBegin = $parts.IndexOf($AddlInnfoTagBegin) + $AddlInnfoTagBegin.Length +1
$indexOfAddlInfoEnd = $parts.IndexOf($AddlInnfoTagEnd) -1
$AddlInfoData = $parts.Substring($indexOfAddlInfoBegin, $indexOfAddlInfoEnd - $indexOfAddlInfoBegin)
$AddlInfoReplaced = $AddlInfoData.Replace(" seconds ","#")
$AddlInfoSplit = $AddlInfoReplaced.Split('#')
$information = $_|Select-Object -Property API, Duration,DataRetrieved, ServerTime, ServerTicks , Identifier, Filename
#get filename, which does not work
$information.Filename = $_.Name
#$information.Filename = $_.FullName
$information.API = $AddlInfoSplit[0].Split(':')[0]
$information.DataRetrieved = $AddlInfoSplit[1]
$information.Duration = $AddlInfoSplit[0].Split(':')[1]
$information.Duration = $information.Duration.Replace("Execution Time = ","")
$indexOfServerTimeBegin = $parts.IndexOf($ServerTimeOfLogTagBegin) + $ServerTimeOfLogTagBegin.Length +1
$indexOfServerTimeEnd = $parts.IndexOf($ServerTimeOfLogTagEnd) -1
$ServerTimeData = $parts.Substring($indexOfServerTimeBegin, $indexOfServerTimeEnd - $indexOfServerTimeBegin)
$information.ServerTime = $ServerTimeData
$indexOfServerTicksBegin = $parts.IndexOf($ServerTicksTagBegin) + $ServerTicksTagBegin.Length +1
$indexOfServerTicksEnd = $parts.IndexOf($ServerTicksTagEnd) -1
$ServerTickData = $parts.Substring($indexOfServerTicksBegin, $indexOfServerTicksEnd - $indexOfServerTicksBegin)
$information.ServerTicks = $ServerTickData
$indexOfIdentifierBegin = $parts.IndexOf($IdentifierTagBegin) + $IdentifierTagBegin.Length +1
$indexOfIdentifierEnd = $parts.IndexOf($IdentifierTagEnd) -1
$IdentifierData = $parts.Substring($indexOfIdentifierBegin, $indexOfIdentifierEnd - $indexOfIdentifierBegin)
$information.Identifier = $IdentifierData
$DurationAsInt = 0 + $information.Duration
if($DurationAsInt -gt $threshold) {
write $information
}
} |
Out-GridView
#Out-File -FilePath $OutPath -Append -Width 200
Any help is appreciated, thanks!!
-CL
The property you are looking for is "Filename". Inside the Foreach-Object block, $_ is the MatchInfo object emitted by Select-String, which has no Name property:
$information.Filename = $_.Filename
PowerShell provides the Get-Member cmdlet, which lists all available properties and methods of an object. You can write the members to the console and inspect what is available:
Write-Host ( $_ | Get-Member)
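A small, self-contained sketch of the same idea: it writes two trimmed log lines from the question to a temporary file (the file name sample-logging-demo.txt is made up), runs Select-String over it, and reads Filename from each MatchInfo. Casting the line to [xml] is an assumption that each log line is well-formed XML on its own, as in the sample; it is shown as an alternative to the IndexOf/Substring arithmetic, not as a drop-in replacement for the full script.

```powershell
# Temporary demo file with two trimmed log lines from the question
$tmp = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-logging-demo.txt'
@'
<log><identifier>123axr4x5</identifier><AddlInfo>Execution Time : 20ms</AddlInfo><serverticks>643670855</serverticks></log>
<log><identifier>dd8jksl3g</identifier><AddlInfo>Execution Time : 80ms</AddlInfo><serverticks>643680865</serverticks></log>
'@ | Set-Content -Path $tmp

$results = Select-String -Path $tmp -Pattern 'Execution Time' | ForEach-Object {
    # $_ is a MatchInfo: Filename, Path, LineNumber and Line are available here
    $entry = ([xml]$_.Line).log
    [PSCustomObject]@{
        Filename    = $_.Filename
        LineNumber  = $_.LineNumber
        Identifier  = $entry.identifier
        Duration    = $entry.AddlInfo -replace '\D'   # keep only the digits, e.g. 20
        ServerTicks = $entry.serverticks
    }
}
Remove-Item -Path $tmp
$results | Format-Table
```

If the full path is needed instead of the bare file name, the MatchInfo object also exposes it as Path.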