I've verified that my regex is correct with this code:
#this is the string where I'm trying to extract everything within the []
$text = "MS14-012[2925418],MS14-029[2953522;2961851]"
$text -match "\[(.*?)\]"
$matches[1]
Output:
True
2925418
I'd like to use Select-String to get my result, like this for example:
$result = $text| Select-String -Pattern $regex
Output:
MS14-012[2925418],MS14-029[2953522;2961851]
What else I've tried:
$result = Select-String -Pattern $regex -InputObject $text
$result = Select-String -Pattern ([regex]::Escape("\[(.*?)\]")) -InputObject $text
And some more variations as well as different kinds of " and ' around the regex and so on. I'm really out of ideas...
Can anyone please tell me why the regex is not matching when I'm using Select-String?
After piping the output to Get-Member I noticed that Select-String returns a MatchInfo object and that I needed to access the MatchInfo.Matches property to get the result. Thanks to Mathias R. Jessen for giving me the hint! ;)
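To make that concrete, here is a minimal sketch of reading the captures out of the MatchInfo object. It reuses the sample string from the question; the -AllMatches switch and the $ids variable are additions for illustration, on the assumption that every bracketed group is wanted rather than only the first.

```powershell
$text = "MS14-012[2925418],MS14-029[2953522;2961851]"

# Select-String returns a MatchInfo object; by default it displays the whole input line.
$result = $text | Select-String -Pattern '\[(.*?)\]' -AllMatches

# The individual regex matches are in .Matches; Groups[1] holds the text inside the brackets.
$ids = $result.Matches | ForEach-Object { $_.Groups[1].Value }
$ids
# 2925418
# 2953522;2961851
```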
Related
I am trying to scrape data from https://www.reuters.com/finance/stocks/lookup?searchType=any&comSortBy=marketcap&sortBy=&dateRange=&search=Accor.
The end goal is to pull the table down that contains the Company, Symbol and Exchange.
I have successfully gained the HTML that I need but I can't pull the data I need from it.
I've used some online RegEx 'helpers' and the string works fine and selects the data I need, but when I try and use the command it doesn't work.
$web = Invoke-WebRequest -uri 'https://www.reuters.com/finance/stocks/lookup?searchType=any&comSortBy=marketcap&sortBy=&dateRange=&search=Accor' -UseBasicParsing
$str = ($web.Content).ToString()
[regex]$regex = '<table[\s\S]*?</table>'
$str | Select-String -Pattern $regex -AllMatches
$str > raw.txt; Select-String -Pattern $regex -Path ./raw.txt -AllMatches
I'm expecting to return the whole element but it returns the full string in the piped command and nothing in the -Path command.
I've also tried doing this using an IE COM object.
Rubber ducky effect.
As soon as I asked I figured it out...
$url = 'https://www.reuters.com/finance/stocks/lookup?searchType=any&comSortBy=marketcap&sortBy=&dateRange=&search=Accor'
$content = (New-Object System.Net.WebClient).DownloadString($url)
$content -match '<table[\s\S]*?</table>'
$matches
Name Value
---- -----
0 <table width="100%" cellspacing="0" cellpadding="1" class="search-table-data">...
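For anyone who still wants Select-String here, a rough sketch follows. The piped command does match; it just displays the whole input string, and the matched element has to be read from the Matches property. The -Path variant reads the file one line at a time, so a pattern spanning several lines cannot match there. The $html sample is a made-up stand-in for the downloaded page, not real Reuters markup.

```powershell
# Hypothetical stand-in for the downloaded page content.
$html = @'
<html><body>
<table class="search-table-data">
<tr><td>Accor</td></tr>
</table>
</body></html>
'@

# Piping a single string means the whole document is one input "line".
$found = $html | Select-String -Pattern '<table[\s\S]*?</table>' -AllMatches

# The matched elements are in .Matches; .Value is the matched text itself.
$tables = $found.Matches | ForEach-Object { $_.Value }
$tables
```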
I'm trying to find a line that matches the below pattern in a file. I can see the pattern is output as I expect but it doesn't match.
String in file
"5/29/2019 12:01:03 PM - Sys - Logged Successfully"
Variables
$pattern = "Logged Successfully"
$datePattern = "5/29/2019"
Code - Working - Matches ok
$reponse = select-string -Path $path\$file -Pattern $pattern -allmatches -simplematch
Code - Not Working
$reponse = select-string -Path $path\$file -Pattern "$($datePattern).*$($pattern)" -allmatches -simplematch
Maybe I'm missing something very simple; any help is greatly appreciated.
Remove the -SimpleMatch switch from the code sample that is not working and it will work. That switch disables regular expression matching and makes Select-String treat the pattern as a literal string, so the .* is searched for verbatim. See this previous SO answer from Mathias R. Jessen where he explains in more detail.
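A small sketch of the difference, using the sample line from the question as an in-memory string instead of a file. The calls to [regex]::Escape are an addition, assuming the date and message should be matched literally while .* stays a regex; $regexHit and $literalHit are names made up for the example.

```powershell
$line = '5/29/2019 12:01:03 PM - Sys - Logged Successfully'

# Escape the literal parts so only ".*" is interpreted as regex syntax.
$datePattern = [regex]::Escape('5/29/2019')
$pattern     = [regex]::Escape('Logged Successfully')

# Regex mode (the default): ".*" means "any characters", so the line matches.
$regexHit = $line | Select-String -Pattern "$($datePattern).*$($pattern)"

# Literal mode: the text ".*" itself would have to appear in the line, so nothing matches.
$literalHit = $line | Select-String -Pattern "$($datePattern).*$($pattern)" -SimpleMatch

[bool]$regexHit     # True
[bool]$literalHit   # False
```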
I’m creating a script that reads a text file and compares the results to an array. It works fine, but I have some records that say they match but they don’t.
For example - TG1032 and TG match according to the select-string script.
Here is my select-string:
$Sel = select-string -pattern $strArrVal -path $txt
Is there a way to alter this to make select-string only match records that are 6 characters long?
I would still like to point out where your pattern is wrong but the solution will most likely be the same regardless. If you are looking to match lines that are exactly 6 characters then you could just use the pattern ^.{6}$.
$strArrVal = "^.{6}$"
Select-String -Pattern $strArrVal -Path $txt
If that is really all you are looking for then regex is not really required. You could do this with Get-Content with similar results
Get-Content $txt | Where-Object{$_.length -eq 6}
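Both approaches can be tried without a file. The sketch below pipes a few made-up records (the values in $lines are assumptions, not data from the question) through each filter.

```powershell
# Hypothetical records standing in for the lines of $txt.
$lines = 'TG1032', 'TG', 'AB12345'

# Anchored regex: exactly six characters between start (^) and end ($) of the line.
$byRegex = $lines | Select-String -Pattern '^.{6}$' | ForEach-Object { $_.Line }

# Same filter without regex, using the string length.
$byLength = $lines | Where-Object { $_.Length -eq 6 }

$byRegex    # TG1032
$byLength   # TG1032
```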
I store the output of a defragmentation analysis in a variable, then I try to match a pattern to retrieve a number.
In the following online regex tester it works fine, but in PowerShell, String -match $pattern returns False.
My code:
$result = Defrag C: /A /V | Out-String
echo $result
$pattern = "fragmenté[^.0-9]*([0-9]+)%"
$result -match $pattern
What am I doing wrong?
I actually had no issue with your code. I just needed to change the pattern to match my English output. It is possible that Wolfgang Kluge is onto something about the whitespace. However, if your output actually matches what you have in the regex tester, then I'm not sure what issue you are having.
For fun I propose this update to your code. This uses ConvertFrom-StringData. I explain the code more in this answer.
$defrag = Defrag C: /A /V | out-string
$hash = (($defrag -split "`r`n" | Where-Object{$_ -match "="}) -join "`r`n" | ConvertFrom-StringData)
$result = New-Object -TypeName PSCustomObject -Property $hash
$result."Quantité totale d'espace fragmenté"
This is of course assuming that your PowerShell is perfectly OK with the accents in the words. On my ISE 3.0 the above code works.
Again, your code as posted in your question was working just fine for me. I also don't think the Out-String is required; I still get positive output without it. Without Out-String I get extra output that includes the entire matched line. With it I just get a Boolean. In both cases (using the following code) I still get a result.
$result = Defrag C: /A /V #| Out-String
$pattern = "fragmented space[^.0-9]*([0-9]+)%"
$result -match $pattern
$Matches[1]
-match works as an array operator, which changes how $result is treated. With Out-String, $result is a single System.String; without it you get a System.Object[] (an array of lines).
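The difference is easy to see with a couple of made-up lines in place of real defrag output (the contents of $lines are assumptions for illustration). Against an array, -match filters and returns the matching elements; against a single string it returns a Boolean and fills $Matches.

```powershell
# Hypothetical sample lines standing in for the defrag report.
$lines   = 'Total fragmented space = 12%', 'Largest free space size = 3 GB'
$pattern = 'fragmented space[^.0-9]*([0-9]+)%'

# Array on the left: -match returns the elements that match.
$fromArray = $lines -match $pattern
$fromArray          # Total fragmented space = 12%

# Single string on the left: -match returns True/False and populates $Matches.
$fromString = ($lines | Out-String) -match $pattern
$fromString         # True
$percent = $Matches[1]
$percent            # 12
```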
False Theory
The only way I can get the match to be False is if I am not running PowerShell as an administrator. That is important because if not you will get a message
The disk defragmenter cannot start because you have insufficient priveleges to perform this operation. (0x89000024)
I have this PowerShell script whose main purpose is to search through HTML files within a folder, find specific HTML markup, and replace it with what I tell it to.
I have been able to do 3/4 of my find and replaces perfectly. The one I am having trouble with involves a Regular Expression.
This is the markup that I am trying to make my regex find and replace:
<a href="programsactivities_skating.html"><br />
</a>
Here is the regex I have so far, along with the function I am using it in:
automate -school "C:\Users\$env:username\Desktop\schools\$question" -query '(?mis)(?!exclude1|exclude2|exclude3)(<a[^>]*?>(\s|&nbsp;|<br\s?/?>)*</a>)' -replace ''
And here is the automate function:
function automate($school, $query, $replace) {
$processFiles = Get-ChildItem -Exclude *.bak -Include "*.html", "*.HTML", "*.htm", "*.HTM" -Recurse -Path $school
foreach ($file in $processFiles) {
$text = Get-Content $file
$text = $text -replace $query, $replace
$text | Out-File $file -Force -Encoding utf8
}
}
I have been trying to figure out the solution to this for about 2 days now, and just can't seem to get it to work. I have determined that the problem is that I need to tell my regex to account for multiple lines, and that's what I'm having trouble with.
Any help anyone can provide is greatly appreciated.
Thanks in Advance.
Get-Content produces an array of strings, where each string contains a single line from your input file, so you won't be able to match text passages spanning more than one line. You need to merge the array into a single string if you want to be able to match more than one line:
$text = Get-Content $file | Out-String
or
[String]$text = Get-Content $file
or
$text = [IO.File]::ReadAllText($file)
Note that the 1st and 2nd method don't preserve line breaks from the input file. Method 2 simply mangles all line breaks, as Keith pointed out in the comments, and method 1 puts <CR><LF> at the end of each line when joining the array. The latter may be an issue when dealing with Linux/Unix or Mac files.
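As a rough illustration of why the array form fails, the sketch below writes a small temporary file and runs the same replacement both ways. The file contents and the shortened $query pattern are assumptions made for the example, not the asker's actual files or full regex.

```powershell
# Write a throwaway sample file (hypothetical content with Windows line endings).
$file = [IO.Path]::GetTempFileName()
[IO.File]::WriteAllText($file, "<p>keep</p>`r`n<a href=`"x.html`"><br />`r`n</a>`r`n<p>end</p>")

# Simplified version of the pattern: an anchor containing only whitespace and <br> tags.
$query = '<a[^>]*>(\s|<br\s?/?>)*</a>'

# Line by line: no single line contains both <a ...> and </a>, so nothing is replaced.
$byLine = (Get-Content $file) -replace $query, ''

# Whole file as one string: the match can span the line break, so the anchor is removed.
$whole = [IO.File]::ReadAllText($file) -replace $query, ''

$byLine -join '|'
$whole

Remove-Item $file
```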
I don't get what it is you're trying to do with those Exclude elements, but I find multi-line regex is usually easier to construct in a here-string:
$text = @'
<a href="programsactivities_skating.html"><br />
</a>
'@
$regex = @'
(?mis)<a href="programsactivities_skating.html"><br />
\s*</a>
'@
$text -match $regex
True
Get-Content returns an array of strings, so you need to concatenate those strings into a single string before running the replacement:
function automate($school, $query, $replace) {
$processFiles = Get-ChildItem -Exclude *.bak -Include "*.html", "*.HTML", "*.htm", "*.HTM" -Recurse -Path $school
foreach ($file in $processFiles) {
$text = ""
# Append each line plus a line break; do not assign the pipeline result, which would reset $text to $null
Get-Content $file | % { $text += $_ + "`r`n" }
$text = $text -replace $query, $replace
$text | Out-File $file -Force -Encoding utf8
}
}