PowerShell regex: only select digits

I am working on a script that parses each line of a log. My issue is that the regex I use returns everything from src= up to the next space.
I only want the IP address, not the src= part. The pattern still has to match from src= up to the space, but the result should contain only the address. Below is what I use, but it is clumsy, so any help would be appreciated.
#example text
$destination="src=192.168.96.112 dst=192.168.5.22"
$destination -match 'src=[^\s]+'
$result = $matches.Values
#turn it into string since trim doesn't work
$result=echo $result
$result=$result.trim("src=")

You can use a lookbehind here, and since -match only returns the first match, you will be able to access the matched value using $matches[0]:
$destination -match '(?<=src=)\S+' | Out-Null
$matches[0]
# => 192.168.96.112
Pattern details:
(?<=src=) - a positive lookbehind that matches a location immediately preceded by src=
\S+ - one or more non-whitespace chars.
To extract all these values, use
Select-String '(?<=src=)\S+' -input $destination -AllMatches | Foreach {$_.Matches} | Foreach-Object {$_.Value}
or
Select-String '(?<=src=)\S+' -input $destination -AllMatches | % {$_.Matches} | % {$_.Value}
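Since the question is about parsing a log line by line, the lookbehind can also be applied inside a loop. The following is a minimal sketch, not the asker's actual script: the sample lines and the variable names $logLines and $sources are made up for illustration.

```powershell
# Hypothetical sample lines standing in for the real log file.
$logLines = @(
    'id=1 src=192.168.96.112 dst=192.168.5.22',
    'id=2 no address here',
    'id=3 src=10.0.0.7 dst=10.0.0.9'
)

$sources = foreach ($line in $logLines) {
    # For a single string, -match returns $true or $false.
    # $Matches is only refreshed on success, so read it inside the if block.
    if ($line -match '(?<=src=)\S+') {
        $Matches[0]
    }
}

$sources
# 192.168.96.112
# 10.0.0.7
```

Testing the result of -match before reading $Matches avoids picking up a stale value left over from an earlier line.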

Another way could be using a capturing group:
src=(\S+)
For example
$destination="src=192.168.96.112 dst=192.168.5.22"
$pattern = 'src=(\S+)'
Select-String $pattern -input $destination -AllMatches | Foreach-Object {$_.Matches} | Foreach-Object {$_.Groups[1].Value}
Output
192.168.96.112
Or, to be a bit more specific, match the dots and the digits explicitly (a stricter pattern is possible if each octet of the IP address has to be validated):
src=(\d{1,3}(?:\.\d{1,3}){3})
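When only the first src= value is needed, the same capturing group can be read through -match and the automatic $Matches variable. This is a small sketch under that assumption; the names $whole and $ip are illustrative only.

```powershell
$destination = 'src=192.168.96.112 dst=192.168.5.22'
$pattern = 'src=(\d{1,3}(?:\.\d{1,3}){3})'

if ($destination -match $pattern) {
    $whole = $Matches[0]   # entire match, including the src= prefix
    $ip    = $Matches[1]   # capture group 1, the address only
}

$whole   # src=192.168.96.112
$ip      # 192.168.96.112
```

Note that \d{1,3} still accepts values such as 999, so this narrows the shape of the match without validating the address.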

Related

Regex Powershell shows too much

I am new to PowerShell. I am trying to automate my work a bit and need a simple extraction of the following pattern from files of all types:
([0-9A-Z]{2,4}.[0-9A-Z]{8}.[0-9A-Z]{8}.[0-9A-Z]{4})
Example:
*lots of text*
X-xdaemon-transaction-id: string=9971.0A67341C.6147B834.0043,ee=3,shh,rec=0.0,recu=0.0,reid=0.0,cu=3,cld=1
X-xdaemon-transaction-id: string=AA71.0A67341C.6147B442.0043,ee=3,shh,rec=0.0,recu=0.0,reip=0.0,cu=3,cld=1
*lots of text*
Unfortunately, I am receiving output like this:
1mAAAA-0005nG-TN-H:220:
X-xdaemon-transaction-id: string=AA71.0A67341C.6147B442.0043,ee=3,shh,rec=0.0,recu=0.0,reip=0.0,cu=3,cld=1
My 'code' is as follows:
Select-String -Path C:\Samples\* -Pattern "(0001.[0-9A-Z]{8}.[0-9A-Z]{8}.[0-9A-Z]{4})" -CaseSensitive
I'd like to get only the matched strings, e.g. AA71.0A67341C.6147B442.0043, without anything added.
Thanks for any help!
You can use
$rx = '\b[0-9A-Z]{2,4}\.[0-9A-Z]{8}\.[0-9A-Z]{8}\.[0-9A-Z]{4}\b'
Select-String -AllMatches -Pattern $rx -Path 'C:\Samples\*' -CaseSensitive | % { $_.matches.value }
That is,
Add word boundaries to match your expected strings as whole words and escape the literal . chars
Use -AllMatches (to get multiple matches per line if any) and access each resulting object match value with $_.matches.value.
PS test:
PS C:\Users\admin> $B = Select-String -AllMatches -Pattern '\b[0-9A-Z]{2,4}\.[0-9A-Z]{8}\.[0-9A-Z]{8}\.[0-9A-Z]{4}\b' -Path 'C:\Samples\*' -CaseSensitive | % { $_.matches.value }
PS C:\Users\admin> $B
9971.0A67341C.6147B834.0043
AA71.0A67341C.6147B442.0043
PS C:\Users\admin>
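To see why escaping the dots matters, the two patterns can be compared on an in-memory string. This is an illustrative sketch only: $sample is a made-up string that uses x as the separator, and $line is a shortened copy of one line from the question.

```powershell
$rxLoose  = '\b[0-9A-Z]{2,4}.[0-9A-Z]{8}.[0-9A-Z]{8}.[0-9A-Z]{4}\b'
$rxStrict = '\b[0-9A-Z]{2,4}\.[0-9A-Z]{8}\.[0-9A-Z]{8}\.[0-9A-Z]{4}\b'

# Hypothetical string with "x" where the dots should be.
$sample = 'AA71x0A67341Cx6147B442x0043'
$looseResult  = $sample -cmatch $rxLoose    # True: an unescaped . matches any char
$strictResult = $sample -cmatch $rxStrict   # False: \. only matches a literal dot

# Extracting just the ID from a full line, as in the answer above.
$line = 'X-xdaemon-transaction-id: string=AA71.0A67341C.6147B442.0043,ee=3,shh'
$ids = $line |
    Select-String -Pattern $rxStrict -AllMatches -CaseSensitive |
    ForEach-Object { $_.Matches.Value }

$ids   # AA71.0A67341C.6147B442.0043
```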
Try (with the literal dots escaped):
$find = Get-ChildItem *.txt | Select-String -Pattern '\b[0-9A-Z]{2,4}\.[0-9A-Z]{8}\.[0-9A-Z]{8}\.[0-9A-Z]{4}\b' -CaseSensitive
$find.Matches.Value

Powershell capture multiple values?

The following code returns only one match.
$s = 'x.a,
x.b,
x.c
'
$s -match 'x\.(.*?)[,$]'
$Matches.Count # returns 2
$Matches[1] # returns a only
Expected to return a, b, c.
The -match operator only finds the first match. Select-String with -AllMatches will fetch all matches in the input. Also, [,$] is a character class that matches a literal , or a literal $; inside a character class, $ is not an end-of-string/line anchor.
A possible solution may look like
Select-String 'x\.([^,]+)' -input $s -AllMatches | Foreach {$_.Matches} | Foreach-Object {$_.Groups[1].Value}
The pattern is x\.([^,]+), it matches x. and then captures into Group 1 any one or more chars other than ,.
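The difference between the two approaches can be checked on the string from the question. The sketch below makes one change that is an assumption on my part, not part of the answer above: the class is [^,\s] rather than [^,], because [^,] also matches line breaks, so the last capture would otherwise include the newline that follows x.c.

```powershell
# Same content as the question's $s, built with explicit LF line breaks.
$s = "x.a,`nx.b,`nx.c`n"

# -match stops at the first match.
$first = if ($s -match 'x\.([^,\s]+)') { $Matches[1] }

# Select-String -AllMatches returns every match in the input string.
$all = Select-String -InputObject $s -Pattern 'x\.([^,\s]+)' -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Groups[1].Value }

$first   # a
$all     # a, b, c (one per line)
```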

Select-string apply to multiple lines of input

I have a file like:
abc WANT THIS
def NOT THIS
ghijk WANT THIS
lmno DO NOT LIKE
pqr WANT THIS
...
From which I want to extract:
abc
ghijk
pqr
When I apply the following:
(Select-String -Path $infile -Pattern "([^ ]+) WANT THIS").Matches.Groups[1].Value >$outfile
It only returns the match for the first line:
abc
(adding -AllMatches did not change the behaviour)
You may use
Select-String -path $infile -Pattern '^\s*(\S+) WANT THIS' -AllMatches | Foreach-Object {$_.Matches} | Foreach-Object {$_.Groups[1].Value} > $outfile
The ^\s*(\S+) WANT THIS pattern will match
^ - start of a line
\s* - 0+ whitespaces
(\S+) - Group 1: one or more non-whitespace chars
WANT THIS - a literal substring.
Now, -AllMatches will collect all matches, then, you need to iterate over all matches with Foreach-Object {$_.Matches} and access Group 1 values (with Foreach-Object {$_.Groups[1].Value}), and then save the results to the output file.
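The same pipeline can be tried without a file by piping the lines in as strings. This is a minimal sketch using the sample content from the question; $lines and $wanted are illustrative names.

```powershell
# The sample file content from the question, as an array of lines.
$lines = @(
    'abc WANT THIS',
    'def NOT THIS',
    'ghijk WANT THIS',
    'lmno DO NOT LIKE',
    'pqr WANT THIS'
)

# Each piped string is tested separately; one MatchInfo is emitted per matching line.
$wanted = $lines |
    Select-String -Pattern '^\s*(\S+) WANT THIS' |
    ForEach-Object { $_.Matches[0].Groups[1].Value }

$wanted
# abc
# ghijk
# pqr
```

Because the pattern is anchored with ^, each line can match at most once, so reading Matches[0] per line is enough here.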
Re-reading the code, it is matching them all, but only writing the value of the first match (doh!):
(Select-String -Path $scriptfile -Pattern "([^ ]+) WANT THIS").Matches.Groups.Value >$tmpfile
On the other hand, the groups in the output also include the whole match (group 0), not only the captured part.
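The reason is that a .NET match always exposes the whole match as group 0, and the capturing groups start at index 1. A short sketch with the question's pattern ($m is an illustrative name):

```powershell
$m = [regex]::Match('abc WANT THIS', '([^ ]+) WANT THIS')

$m.Groups.Count      # 2
$m.Groups[0].Value   # abc WANT THIS  (the whole match)
$m.Groups[1].Value   # abc            (the first capturing group)
```

So .Groups.Value lists both values for every match, while .Groups[1].Value has to be read per match to get only the captured part.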

How to modify this regex to work in Powershell

So I have this regex https://regex101.com/r/xG8oX2/2 which gives me the matches I want.
But when I run this powershell script, it give me no matches. What should I modify in this regex to be able to get the same matches in powershell?
$pattern2 = '\d{4}\/\d{2}\/\d{2}.*]\s(?<reportHash>.*):.*Start.*\r*\n*.*\n.*ReportLayoutID=(\d{1,7})';
$reportLayoutIDList = Get-Content -Path bigOptions.txt | Out-String |
Select-String -Pattern $pattern2 -AllMatches |
Select-Object -ExpandProperty Matches |
Select-Object @{n="ReportHash";e={$_.Groups["reportHash"]}},
@{n="LayoutID";e={$_.Groups["reportLayoutID"]}};
$reportLayoutIDList |
Export-csv reportLayoutIDList.csv;
The problem is your line breaks. On Windows, line breaks are CRLF (\r\n), while on UNIX and similar systems they are just LF (\n).
So you either need to modify the input to use only LF, or replace \n with \r\n in your regex.
As @briantist mentioned, using \r?\n will match either way.
Thank you to both Frode F and briantist.
This is the regex pattern that worked in Powershell:
$pattern2 = '\d{4}\/\d{2}\/\d{2}.*]\s(?<reportHash>.*):.*Start.*[\r?\n]*.*[\r?\n].*ReportLayoutID=(?<reportLayoutID>\d+)';
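Note that [\r?\n] is a character class, so it matches a single \r, a literal ?, or \n, which is looser than the \r?\n sequence suggested above. The sketch below shows the \r?\n form on two made-up strings; the sample text and the simplified pattern are assumptions for illustration, not the asker's real data.

```powershell
# Hypothetical two-line snippets, one with CRLF and one with LF line breaks.
$crlf = "Start report`r`nReportLayoutID=42"
$lf   = "Start report`nReportLayoutID=42"

# \r?\n matches an optional carriage return followed by a line feed.
$pattern = 'Start.*\r?\n.*ReportLayoutID=(?<reportLayoutID>\d+)'

$ids = foreach ($text in $crlf, $lf) {
    if ($text -match $pattern) {
        # Named groups are available by name in $Matches.
        $Matches['reportLayoutID']
    }
}

$ids
# 42
# 42
```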

Powershell regex group replacing

I want to replace some text in every script file in folder, and I'm trying to use this PS code:
$pattern = '(FROM [a-zA-Z0-9_.]{1,100})(?<replacement_place>[a-zA-Z0-9_.]{1,7})'
Get-ChildItem -Path 'D:\Scripts' -Recurse -Include *.sql | ForEach-Object { (Get-Content $_.fullname) -replace $pattern, 'replace text' | Set-Content $_.fullname }
But I have no idea how to keep first part of expression, and just replace the second one. Any idea how can I do this? Thanks.
I am not sure the provided regex for table names is correct, but in any case you can reference captures in the replacement with $1, $2 and so on, using the following syntax: 'Doe, John' -ireplace '(\w+), (\w+)', '$2 $1'
Note that the replacement pattern either needs to be in single quotes ('') or have the $ signs of the replacement group specifiers escaped ("`$2 `$1").
# it may be better to use $pattern = '(FROM) (?<replacement_place>[a-zA-Z0-9_.]{1,7})'
$pattern = '(FROM [a-zA-Z0-9_.]{1,100})(?<replacement_place>[a-zA-Z0-9_.]{1,7})'
Get-ChildItem -Path 'D:\Scripts' -Recurse -Include *.sql | ForEach-Object {
    # Read the whole file first (parentheses), then rewrite it line by line.
    (Get-Content $_.FullName) |
        ForEach-Object { $_ -replace $pattern, '$1 replace text' } |
        Set-Content $_.FullName -Force
}
If you need to reference other variables in your replacement expression (as you may), you can use a double-quoted string and escape the capture dollars with a backtick
{ $_ -replace $pattern, "`$1 replacement text with $somePoshVariable" } |
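
Both quoting styles can be tried on a single string before touching any files. This is a sketch under assumptions: the SQL text, the simplified pattern, and the names $sql, $newTable, $single and $double are invented for illustration. It uses the ${1} form of the group reference so that the text following it cannot be read as part of the group number.

```powershell
# Hypothetical SQL line and a simplified pattern: group 1 keeps "FROM ",
# the named group covers the table name that gets replaced.
$sql     = 'SELECT * FROM dbo.Orders WHERE id = 1'
$pattern = '(FROM )(?<replacement_place>[a-zA-Z0-9_.]+)'

# Single quotes: ${1} reaches the regex engine untouched.
$single = $sql -replace $pattern, '${1}dbo.Archive'

# Double quotes: escape the group's $ with a backtick so PowerShell
# leaves it alone, while $newTable is still expanded.
$newTable = 'dbo.Archive'
$double = $sql -replace $pattern, "`${1}$newTable"

$single   # SELECT * FROM dbo.Archive WHERE id = 1
$double   # SELECT * FROM dbo.Archive WHERE id = 1
```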