Get an array of captures from a regex search in PowerShell - regex

Let's say I have this string:
"14 be h90 bfh4"
And I have this regex pattern:
"(\w+)\d"
In PowerShell, how do I get an array with the contents {"h", "bfh"}?

You want to capture one or more letters that are followed by a digit, so the regex is:
[a-zA-Z]+(?=\d)
The PowerShell code for this is:
$str = "14 be h90 bfh4"
$reg = "[a-zA-Z]+(?=\d)"
$spuntext = $str | Select-String $reg -AllMatches |
ForEach-Object { $_.Matches.Value }
echo $spuntext
Disclaimer: I barely know the PowerShell scripting language, so you may have to tweak the code.

A slightly shorter version:
@(Select-String "[a-zA-Z]+(?=\d)" -Input "14 be h90 bfh4" -AllMatches).Matches.Value

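There are multiple ways to skin a cat, as the other answers demonstrate. Yet another way is to use the [regex] class provided by .NET. Expanding the Value property returns plain strings rather than objects with a Value property:
$regex = [regex] '([a-z]+)(?=\d+)'
$regex.Matches("14 be h90 bfh4") | Select-Object -ExpandProperty Value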
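To see why the pattern from the question, (\w+)\d, does not return {"h", "bfh"} on its own, here is a small sketch comparing it with a letters-only capture group. The variable names are illustrative and do not come from the posts above.

```powershell
$str = '14 be h90 bfh4'

# \w also matches digits, so the greedy group swallows leading digits.
$withWord = [regex]::Matches($str, '(\w+)\d') |
    ForEach-Object { $_.Groups[1].Value }

# Restricting the group to letters captures only the text before the digit.
$withLetters = [regex]::Matches($str, '([a-zA-Z]+)\d') |
    ForEach-Object { $_.Groups[1].Value }

$withWord      # 1, h9, bfh
$withLetters   # h, bfh
```

Reading Groups[1] instead of Value is what lets you keep the trailing \d in the pattern without a lookahead.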

Related

How Do I change a string in a specific line contained in a file preserving all other lines?

I have a file that contains this information:
Type=OleDll
Reference=*\G{00020430-0000-0000-C000-000000000046}#2.0#0#..\..\..\..\..\..\..\Windows\SysWOW64\stdole2.tlb#OLE Automation
Reference=*\G{7C0FFAB0-CD84-11D0-949A-00A0C91110ED}#1.0#0#..\..\..\..\..\..\..\Windows\SysWOW64\msdatsrc.tlb#Microsoft Data Source Interfaces for ActiveX Data Binding Type Library
Reference=*\G{26C4A893-1B44-4616-8684-8AC2FA6B0610}#1.0#0#..\..\..\..\..\..\..\Windows\SysWow64\Conexion_NF.dll#Zeus Data Access Library 1.0 (NF)
Reference=*\G{9668818B-3228-49FD-A809-8229CC8AA40F}#1.0#0#..\packages\ZeusMaestrosContabilidad.19.3.0\lib\native\ZeusMaestrosContabilidad190300.dll#Zeus Maestros Contables Des (Contabilidad)
I need to change the data between {} characters on line 5 using powershell and save the change preserving all other information in the file.
You can use the -replace operator to perform a regex match and string replacement.
If there is only one pair of {} per line, you can do the following, where .*? matches as few non-newline characters as possible. Since Get-Content by default returns an array of lines, you can access each line by index, with [4] being line 5.
$content = Get-Content File.txt
$content[4] = $content[4] -replace '{.*?}','{new data}'
$content | Set-Content File.txt
If there could be multiple {} pairs per line, you will need to be more specific with your regex. A positive lookbehind assertion (?<=) will do.
$content = Get-Content File.txt
$content[4] = $content[4] -replace '(?<=Reference=\*\\G){.*?}','{newest data}'
$content | Set-Content File.txt
If you don't know which line contains the data you want to replace, you will need to be more specific about the data you are replacing. The parentheses around Get-Content are required so that the file is read completely before -replace is applied and the file is rewritten.
(Get-Content File.txt) -replace '{9668818B-3228-49FD-A809-8229CC8AA40F}','{New Data}' | Set-Content File.txt
If there are any encoding requirements, consider using the -Encoding parameter on the Get-Content and Set-Content commands.
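The read, modify, write cycle described above can be tried end to end with a throwaway file. This is only a sketch: the temp file name, the shortened sample lines, and the {NEW} value are made up for illustration.

```powershell
# Hypothetical sample file; the path and contents are for illustration only.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'replace-demo.txt'
@(
    'Type=OleDll'
    'Reference=*\G{AAAA}#1.0#0#a.tlb#First'
    'Reference=*\G{BBBB}#1.0#0#b.tlb#Second'
) | Set-Content -Path $path -Encoding UTF8

# Read all lines first; @() keeps this an array even for a one-line file.
$content = @(Get-Content -Path $path -Encoding UTF8)

# Change only the third line (index 2); the other lines are written back unchanged.
$content[2] = $content[2] -replace '(?<=Reference=\*\\G)\{.*?\}', '{NEW}'
$content | Set-Content -Path $path -Encoding UTF8

$result = Get-Content -Path $path -Encoding UTF8
Remove-Item -Path $path
$result
```

Passing the same -Encoding value to both cmdlets keeps the file's encoding stable across the rewrite.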
Try Regex: (?<=(?:.*\n){4}Reference=\*\\G\{)[\w-]+
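If the content of the {} is always the same, you can match it literally (here $yourfile is the path to the file and $newValue is the replacement text, braces included):
(Get-Content $yourfile) -replace '\{9668818B-3228-49FD-A809-8229CC8AA40F\}', $newValue | Set-Content $yourfile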
One solution:
$Content=Get-Content "C:\temp\test.txt"
$Row5Splited=$Content[4].Split("{}".ToCharArray())
$Content[4]="{0}{1}{2}" -f $Row5Splited[0], "{YOURNEWVALUE}", $Row5Splited[2]
$Content | Out-File "C:\temp\test2.txt"
One approach would be to find,
(.*Reference=\*\\G{)[^\r\n}]+
and replace with,
$1any_thing_you_like_to_replace_with
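In PowerShell, this find-and-replace maps onto the -replace operator. A minimal sketch, assuming a made-up replacement GUID and a shortened sample line: because the new value starts with a digit, the group reference is written as ${1}, since a $1 directly followed by digits would be read as a different group number.

```powershell
# Sample line and replacement GUID are hypothetical.
$line  = 'Reference=*\G{9668818B-3228-49FD-A809-8229CC8AA40F}#1.0#0#lib.dll#Demo'
$newId = '00000000-0000-0000-0000-000000000000'

# Single quotes keep PowerShell from expanding ${1}; the regex engine
# replaces it with the text captured by group 1.
$replacement = '${1}' + $newId

$updated = $line -replace '(.*Reference=\*\\G\{)[^\r\n}]+', $replacement
$updated
```

The closing } is not part of the match, so it is preserved without having to be repeated in the replacement.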

Extract string from text file via Powershell

I have been trying to extract certain values from multiple lines inside a .txt file with PowerShell.
Host
Class
INCLUDE vmware:/?filter=Displayname Equal "server01" OR Displayname Equal "server02" OR Displayname Equal "server03 test"
This is what I want :
server01
server02
server03 test
I have code so far :
$Regex = [Regex]::new("(?<=Equal)(.*)(?=OR")
$Match = $Regex.Match($String)
You may use
[regex]::matches($String, '(?<=Equal\s*")[^"]+')
See the regex demo.
See more ways to extract multiple matches here. However, your main problem is the regex pattern. The (?<=Equal\s*")[^"]+ pattern matches:
(?<=Equal\s*") - a location preceded with Equal and 0+ whitespaces and then a "
[^"]+ - consumes 1+ chars other than double quotation mark.
Demo:
$String = "Host`nClass`nINCLUDE vmware:/?filter=Displayname Equal ""server01"" OR Displayname Equal ""server02"" OR Displayname Equal ""server03 test"""
[regex]::matches($String, '(?<=Equal\s*")[^"]+') | Foreach {$_.Value}
Output:
server01
server02
server03 test
Here is a full snippet reading the file in, getting all matches and saving to file:
$file = 'file.txt'
$newfile = 'newtext.txt'
$regex = '(?<=Equal\s*")[^"]+'
Get-Content $file |
Select-String $regex -AllMatches |
Select-Object -Expand Matches |
ForEach-Object { $_.Value } |
Set-Content $newfile
Another option (PSv3+), combining [regex]::Matches() with the -replace operator for a concise solution:
$str = @'
Host
Class
INCLUDE vmware:/?filter=Displayname Equal "server01" OR Displayname Equal "server02" OR Displayname Equal "server03 test"
'@
[regex]::Matches($str, '".*?"').Value -replace '"'
Regex ".*?" matches all "..."-enclosed tokens; .Value extracts them, and -replace '"' strips the " chars.
It may not be obvious, but this happens to be the fastest solution among the answers here, based on my tests - see bottom.
As an aside: The above would be even more PowerShell-idiomatic if the -match operator - which only looks for a (one) match - had a variant named, say, -matchall, so that one could write:
# WISHFUL THINKING (as of PowerShell Core 6.2)
$str -matchall '".*?"' -replace '"'
See this feature suggestion on GitHub.
Optional reading: performance comparison
Pragmatically speaking, all solutions here are helpful and may be fast enough, but there may be situations where performance must be optimized.
Generally, using Select-String (and the pipeline in general) comes with a performance penalty - while offering elegance and memory-efficient streaming processing.
Also, repeated invocation of script blocks (e.g., { $_.Value }) tends to be slow - especially in a pipeline with ForEach-Object or Where-Object, but also - to a lesser degree - with the .ForEach() and .Where() collection methods (PSv4+).
In the realm of regexes, you pay a performance penalty for variable-length look-behind expressions (e.g. (?<=EQUAL\s*")) and the use of capture groups (e.g., (.*?)).
Here is a performance comparison using the Time-Command function, averaging 1000 runs:
Time-Command -Count 1e3 { [regex]::Matches($str, '".*?"').Value -replace '"' },
{ [regex]::matches($String, '(?<=Equal\s*")[^"]+') | Foreach {$_.Value} },
{ [regex]::Matches($str, '\"(.*?)\"').Groups.Where({$_.name -eq '1'}).Value },
{ $str | Select-String -Pattern '(?<=Equal\s*")[^"]+' -AllMatches | ForEach-Object{$_.Matches.Value} } |
Format-Table Factor, Command
Sample timings from my MacBook Pro; the exact times aren't important (you can remove the Format-Table call to see them), but the relative performance is reflected in the Factor column, from fastest to slowest.
Factor Command
------ -------
1.00 [regex]::Matches($str, '".*?"').Value -replace '"' # this answer
2.85 [regex]::Matches($str, '\"(.*?)\"').Groups.Where({$_.name -eq '1'}).Value # AdminOfThings'
6.07 [regex]::matches($String, '(?<=Equal\s*")[^"]+') | Foreach {$_.Value} # Wiktor's
8.35 $str | Select-String -Pattern '(?<=Equal\s*")[^"]+' -AllMatches | ForEach-Object{$_.Matches.Value} # LotPings'
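Time-Command is not a built-in cmdlet, so if it is not available, a rough timing can be taken with the built-in Measure-Command instead. This is only a sketch under that assumption: the sample string and run count are illustrative, and the numbers will vary by machine.

```powershell
$str  = 'Displayname Equal "server01" OR Displayname Equal "server02"'
$runs = 1000

# Time many iterations so that per-call overhead averages out.
$elapsed = Measure-Command {
    foreach ($i in 1..$runs) {
        $null = [regex]::Matches($str, '".*?"').Value -replace '"'
    }
}

# Average milliseconds per run.
$elapsed.TotalMilliseconds / $runs
```

Repeating the same measurement for each candidate command and dividing by the fastest result gives ratios comparable to the Factor column.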
You can modify your regex to use a capture group, which is indicated by the parentheses. The backslashes just escape the quotes. This allows you to just capture what you are looking for and then filter it further. The capture group here is automatically named 1 since I didn't provide a name. Capture group 0 is the entire match including quotes. I switched to the Matches method because that encompasses all matches for the string whereas Match only captures the first match.
$regex = [regex]'\"(.*?)\"'
$regex.matches($string).groups.where{$_.name -eq 1}.value
If you want to export the results, you can do the following:
$regex = [regex]'\"(.*?)\"'
$regex.matches($string).groups.where{$_.name -eq 1}.value | Set-Content "c:\temp\export.txt"
An alternative that reads the file directly with Select-String, using Wiktor's good regex:
Select-String -Path .\file.txt -Pattern '(?<=Equal\s*")[^"]+' -AllMatches|
ForEach-Object{$_.Matches.Value} | Set-Content NewFile.txt
Sample output:
> Get-Content .\NewFile.txt
server01
server02
server03 test

powershell regex select string in variable

I am trying to create a script that selects the four digits that our company computers have in their host names.
I have tested the regex '\d{4}' on a regex website, and it correctly selects the four digits, but when I use it in PowerShell I only get $true or $false.
I need the four digits to be kept in a variable for later use, but I haven't managed it.
Any ideas?
$machinename = "mac0016w701"
$test = $machinename -match '\d{4}'
$test2= Select-String -Pattern '\d{4}' -inputobject $machinename
$test2
-match is an operator which returns true/false, so you can use it in tests. If you want the values from the regex, it sets the magic variable $Matches, e.g.
PS D:\> 'computer1234' -match '\d{4}'
True
PS D:\> $matches[0]
1234
Alternately, you could use:
[regex]::Matches('computer1234', '\d{4}').Value
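Applied to the host name from the question, storing the digits for later use could look like the sketch below. The $number variable name is an assumption; the important detail is that $Matches is only populated when -match succeeds, so the assignment sits inside the if.

```powershell
$machinename = 'mac0016w701'

# -match returns $true/$false; on success it fills the automatic $Matches variable.
if ($machinename -match '\d{4}') {
    $number = $Matches[0]   # the first run of four digits
}

$number   # 0016
```

The value is a string, so leading zeros are preserved; cast it with [int] only if you need a number.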

How do I capture a value using Select-String and regex matching in PowerShell?

I'm trying to access the capture from a regex match using Select-String. Here's an example of the text.
'dataarea' - closed, 10,933 rows are stored
I would like to capture the number of rows (10,933).
PS> Select-String -Path $foo -Pattern 'closed,\s\d+[.,\d.*]{1,}' `
| %{ $_.Matches } `
| %{ $_.Value } > $output
I tried different regexes, but can't figure out how to capture just the number. This also captures the opening pattern, closed.
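Here's another possibility, using a lookbehind so that only the number is matched (requiring the match to start with a digit keeps the comma after "closed" from being matched on its own):
'(?<=closed,\s*)\d[\d,]*'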
So far, there's no guarantee that the comma will always be there, or that there will only ever be one.
This works with or without commas separating the thousands:
$regex = '\d+(?:,\d+)*'
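Another way to get just the number out of Select-String is a capture group, read through Matches[0].Groups[1]. The sketch below pipes the sample line in as a string rather than reading a file, and the pattern and variable names are assumptions for illustration.

```powershell
$line = "'dataarea' - closed, 10,933 rows are stored"

# Group 1 holds only the digits and commas between "closed," and "rows".
$m = $line | Select-String -Pattern 'closed,\s*([\d,]+)\s+rows'
$rows = $m.Matches[0].Groups[1].Value

# Strip the thousands separators to get a usable integer.
$count = [int]($rows -replace ',')

$rows    # 10,933
$count   # 10933
```

Because the surrounding text is matched but not captured, this works the same way with no comma or with several commas in the number.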

Is there a shorter way to pull groups out of a Powershell regex?

In PowerShell I find myself doing this kind of thing over and over again for matches:
some-command | select-string '^(//[^#]*)' |
%{some-other-command $_.matches[0].groups[1].value}
So basically - run a command that generates lines of text, and for each line I want to run a command on a regex capture inside the line (if it matches). Seems really simple. The above works, but is there a shorter way to pull out those regex capture groups? Perl had $1 and so on, if I remember right. Posh has to have something similar, right? I've seen "$matches" references on SO but can't figure out what makes that get set.
I'm very new to PowerShell btw, just started learning.
You can use the -match operator to reformulate your command as:
some-command | Foreach-Object { if($_ -match '^(//[^#]*)') { some-other-command $($matches[1])}}
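$Matches is set by a successful -match against a single string, and it also supports named groups, which can read better than numeric indexes. A small sketch with made-up input lines and a hypothetical group name, path:

```powershell
# Sample input; in practice these lines would come from some-command.
$lines = '//server/share # comment', 'plain text', '//other/path'

$captured = foreach ($line in $lines) {
    # Only lines that match populate $Matches; others are skipped.
    if ($line -match '^(?<path>//[^#]*)') {
        $Matches['path'].TrimEnd()
    }
}

$captured   # //server/share, //other/path
```

$Matches[1] and $Matches['path'] refer to the same capture here; $Matches[0] is always the whole match.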
Named Replacement
'foo bar' -replace '(?<First>foo).+', '${First}'
Returns: foo
Unnamed Replacement
'foo bar' -replace '(foo).+(ar)', '$2 z$2 $1'
Returns: ar zar foo
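You could try this (the parentheses are required so that the [regex]::Match() call is evaluated as an expression instead of being passed as a literal string):
Get-Content foo.txt | ForEach-Object { some-other-command ([regex]::Match($_, '^(//[^#]*)').Value) }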