PowerShell - How to uppercase a string found with a regex [duplicate]

This question already has answers here:
Lambda Expression in Powershell
(3 answers)
Closed 3 years ago.
I am writing a PowerShell script to parse an HTM file. I need to find all the file links in the file and then uppercase the file path, file name and extension (there could be 30 or 40 links in any file). The part I'm having trouble with is the second argument of the -replace statement below (the 'XXXX' part). The regex WILL find the strings I'm looking for, but I can't figure out how to replace each string with an uppercase version and then update the existing file with the new links.
I hope I'm explaining this correctly. Appreciate any assistance that anyone can provide.
This is the code I have so far...
$FilePath = 'C:\WebDev'
$FileName = 'Class.htm'
[regex]$regex='(href="[^.]+\.htm")'
#Will Match the string href="filepath/file.htm"
( Get-Content "$FilePath\$FileName") -replace $regex , 'XXXX' | Set-Content "$FilePath\$FileName";
The final string written back to the existing file should look like this: HREF="FILEPATH/FILE.HTM"

Both beatcracker and briantist refer you to this answer, which shows the correct approach. The regex replacement syntax has no way to convert text to uppercase, so you need to hook into the .NET String.ToUpper() method.
Instead of using -replace, use the .Replace() method on your $regex object (as described in the other answer). You also need the ForEach-Object construct so it gets called for each string in the pipeline. I've split up the last line for readability, but you can keep it on one line if you must.
$FilePath = 'C:\WebDev'
$FileName = 'Class.htm'
[regex]$regex='(href="[^.]+\.htm")'
(Get-Content "$FilePath\$FileName") |
ForEach-Object { $regex.Replace($_, {$args[0].Value.ToUpper()}) } |
Set-Content "$FilePath\$FileName"
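To see the match evaluator at work without touching a file, here is a minimal sketch that runs the same regex against a single in-memory line. The sample HTML line is made up for illustration; everything else comes from the code above.

```powershell
# Made-up sample standing in for one line of the HTM file.
$line = '<a href="filepath/file.htm">Class</a>'

[regex]$regex = '(href="[^.]+\.htm")'

# The script block is converted to a MatchEvaluator delegate.
# $args[0] is the Match object for each hit; its Value is the matched text.
$upper = $regex.Replace($line, { $args[0].Value.ToUpper() })

$upper   # <a HREF="FILEPATH/FILE.HTM">Class</a>
```

Only the matched href="..." portion is uppercased; the rest of the line is left alone. As far as I know, PowerShell 6 and later also let -replace take a script block directly, but the .Replace() form shown here works in Windows PowerShell as well.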

Related

Powershell: Regex matching with Get-content -Raw flag results in empty results [duplicate]

Solution was adding (?ms) to the front of my regex query
I am trying to search for chunks of text within a file while preserving the line breaks in each chunk.
When I define my variable as $variable = get-content $fromfile,
my function (below) is able to find the text I'm looking for, but the result is difficult to parse further because the line breaks are gone.
function FindBetween($first, $second, $importing){
    $pattern = "$first(.*?)$second"
    $result = [regex]::Match($importing, $pattern).Groups[1].Value
    return $result
}
When I define my variable as $variable = get-content $fromfile -raw, the output of my query is blank. I'm able to print the variable, and it does preserve the line breaks.
I run into the same issue regardless of whether I add \r\n to the end of my pattern, use @() around my variable definition, use -Delimiter \n, or any combination of those.
Whole code is here:
param($fromfile)
$working = get-content $fromfile -raw
function FindBetween($first, $second, $importing){
    $pattern = "(?ms)$first(.*?)$second"
    $result = [regex]::Match($importing, $pattern).Groups[1].Value
    #$result = select-string -InputObject $importing -Pattern $pattern
    return $result
}
FindBetween -first "host ####" -second "#### flag 2" -importing $working | Out-File "testresult.txt"
the file I'm testing it against looks like:
#### flag 1 host ####
stuff in between
#### flag 2 server ####
#### process manager ####
As to why I'm doing this:
I'm trying to automate taking a file that has defined sections with titles and outputting the content of those separate sections into a .csv (each section is formatted drastically different from each other). These files are all uniform to each other, containing the same sections and general content.
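If you're using -Raw, you probably need to change your regex to "(?ms)$first(.*?)$second" so that . will match newlines. Strictly speaking it is the s (single-line) option that lets . match a newline; m only changes how ^ and $ behave.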
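A small sketch of the difference, using an in-memory string built from the sample file in the question instead of Get-Content -Raw (the variable names are mine):

```powershell
# Stand-in for what Get-Content -Raw would return: one string with embedded line breaks.
$text = "#### flag 1 host ####`r`nstuff in between`r`n#### flag 2 server ####"

# Without the s option, '.' refuses to cross the line feed, so the match fails
# and Groups[1].Value is an empty string.
$without = [regex]::Match($text, 'host ####(.*?)#### flag 2').Groups[1].Value

# With (?s), '.' matches line breaks too, so the text between the flags is captured.
$with = [regex]::Match($text, '(?s)host ####(.*?)#### flag 2').Groups[1].Value

"without: [$without]"
"with:    [$with]"
```

The captured value keeps its line breaks, which is what makes the result easier to parse further.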

How to preserve new line characters when performing a regex match in PowerShell [duplicate]

The goal is simple:
Take contents of text file
Search for pattern
Save search back to the file
For example, I want to find the first occurrence between # and ##. The following regex works perfectly: (\#)(.*?)(?=\#{2}). It finds what I want. However, PowerShell removes all newline characters, effectively changing the formatting. So the following input text
#
This
Is
My
Text
##
becomes this
# This Is My Text
How to preserve the formatting?
Here is my PowerShell script
param (
    [string]$filename
)
$content = Get-Content -Path $filename
$output = $filename
$regex = [Regex]::new('(\#)(.*?)(?=\#{2})')
$matches = $regex.Matches($content)
Set-Content -Path $output $Matches[0]
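A hedged sketch of what is probably going on here: Get-Content without -Raw returns an array of lines, and when that array is handed to a method expecting a single string, PowerShell joins the elements with a space. The example simulates both cases in memory; the -join stands in for Get-Content -Raw, and adding (?s) to the pattern is my assumption about the fix, based on the linked duplicates.

```powershell
# Simulates Get-Content without -Raw: one array element per line.
$lines = '#', 'This', 'Is', 'My', 'Text', '##'

# Converting the array to a string joins the elements with a space (the default $OFS),
# which is where the line breaks disappear.
$asString = [string]$lines

# Simulates Get-Content -Raw: a single string that still contains the line breaks.
$raw = $lines -join "`n"

# (?s) lets '.' match line breaks so the lazy group can span several lines.
$regex = [regex]::new('(?s)(\#)(.*?)(?=\#{2})')
$found = $regex.Match($raw).Value

$asString
$found
```

With the single multi-line string, the matched text keeps its original line structure, so writing it back with Set-Content would preserve the formatting.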

PowerShell Regex Bulk Replace Filenames [duplicate]

I am trying to rename files in a given folder in PowerShell, using a regular expression both to filter the files and to build the new filenames.
For example, a file name "CEX-13" should be renamed to "C-0013"; "CEX-14" should change to "C-0014", etc.
I have this, but I get an error that I cannot create a file that already exists.
My current code is:
foreach ($file in get-childitem | where-object {$_.name -match "^CEX-(\d\d)"})
{
    rename-item $file -newname ($file.name -replace "^CEX-(\d\d)", "C-00$1")
}
Any help will be greatly appreciated.
The dollar sign in the replacement has to survive PowerShell's variable expansion in double-quoted strings, so that it is still a dollar sign when it reaches the regex engine.
Currently "C-00$1" becomes "C-00" because PowerShell expands $1 as an (empty) variable, so all the files get the same name.
You need to escape it with a backtick:
"C-00`$1"
or use single quotes: 'C-00$1'
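The three quoting variants can be compared side by side on a plain string, without renaming anything. This sketch assumes no variable named $1 is defined in the session and that strict mode is off, which is the default.

```powershell
$name = 'CEX-13'

# Double quotes: PowerShell expands $1 (undefined, so empty) before the regex engine sees it.
$double  = $name -replace '^CEX-(\d\d)', "C-00$1"

# Single quotes: $1 reaches the regex engine and refers to the first capture group.
$single  = $name -replace '^CEX-(\d\d)', 'C-00$1'

# Backtick-escaped dollar inside double quotes behaves like the single-quoted form.
$escaped = $name -replace '^CEX-(\d\d)', "C-00`$1"

"double:  $double"
"single:  $single"
"escaped: $escaped"
```

The double-quoted form yields the same name for every file, which explains the "file already exists" error on the second rename.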

How can I extract strings from some text file with powershell script?

I want to extract some strings from text files. After examining the files, I found a pattern in where the strings appear.
With the help of Google searches I put together a short PowerShell script. It receives two parameters (the text file path and the keyword to extract on) and extracts the strings from the text file.
After finding and extracting the target strings from the file $tpath\temp.txt, the script saves them to another file, $tpath\tmpVI.txt.
Set-PSDebug -Trace 2 -step
$txtpath=$args[0]
$exkey=$args[1]
$tfile=gc "$tpath\temp.txt"
$savextracted="$tpath\tmpVI.txt"
$tfile -replace '&amp;', '&' -replace '^.*$exkey', '' -replace '\s.*$', '' -replace '\\.*$','' | out-file "$savextracted" -encoding ascii
But so far the extracted and saved result has been wrong, never the strings I wanted.
From PS debugging, it seems the regular expressions in the last line cause trouble, and so does the variable $exkey inside the quoted -replace pattern. But I don't know how to fix this. What should I do?
If you're looking to capture lines that have your match, here's a snippet that solves that problem:
Function Get-Matches
{
    Param(
        [Parameter(Mandatory,Position=0)]
        [String] $Path,
        [Parameter(Mandatory,Position=1)]
        [String] $Regex
    )
    # @() forces an array so that -match filters lines instead of returning a Boolean
    @(Get-Content -Path $Path) -match $Regex
}
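On the $exkey part of the question: inside single quotes PowerShell does not expand variables, so the pattern '^.*$exkey' is sent to the regex engine as literal text and will not match the keyword. A minimal sketch of one way around it, using a made-up keyword and sample line (both hypothetical), with [regex]::Escape() guarding against special characters in the keyword:

```powershell
# Hypothetical keyword and input line, for illustration only.
$exkey = 'VI='
$line  = 'header VI=12345 trailing text'

# Single quotes: the pattern contains the literal characters "$exkey",
# so nothing matches and the line comes back unchanged.
$literal = $line -replace '^.*$exkey', ''

# Double quotes expand the variable; [regex]::Escape() makes the keyword match literally.
$expanded = $line -replace "^.*$([regex]::Escape($exkey))", ''

# Then drop everything from the first whitespace onward, as in the original script.
$value = $expanded -replace '\s.*$', ''

$literal
$expanded
$value
```

Note also that the script in the question assigns $txtpath but later reads $tpath, which may be worth checking.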

Replace an entire line of text using powershell and regexp?

I have a programming background, but I am fairly new to both PowerShell scripting and regular expressions. Regex has always eluded me, and my prior projects have never forced me to learn it.
With that in mind, I have a file with a line of text that I need to replace. I cannot depend on knowing where the line is, whether it has whitespace in front of it, or what the actual text being replaced is. I do know what will precede and follow the text being replaced.
Again, I will not know the value of "Replace This Text". I will only know what precedes it, "<find-this-text>", and what follows it, "</find-this-text>". Edited OP to clarify. Thanks!
LINE OF TEXT I NEED TO REPLACE
<find-this-text>Replace This Text</find-this-text>
POTENTIAL CODE
(gc $file) | % { $_ -replace “”, “” } | sc $file
Get the content of the file. Enclose this in parentheses to ensure the file is first read and then closed, so it doesn't throw an error when trying to save the file.
Iterate through each line and issue the replace statement. THIS IS WHERE I COULD USE HELP.
Save the file by using Set-Content. My understanding is that this method is preferable because it takes encoding, such as UTF-8, into consideration.
XML is not a line oriented format (nodes may span several lines, just as well as a line may contain several nodes), so it shouldn't be edited as if it were. Use a proper XML parser instead.
$xmlfile = 'C:\path\to\your.xml'
[xml]$xml = Get-Content $xmlfile
$node = $xml.SelectSingleNode('//find-this-text')
$node.'#text' = 'replacement text'
For saving the XML in "UTF-8 without BOM" format you can call the Save() method with a StreamWriter doing The Right Thing™:
$UTF8withoutBOM = New-Object Text.UTF8Encoding($false)
$writer = New-Object IO.StreamWriter ($xmlfile, $false, $UTF8withoutBOM)
$xml.Save($writer)
$writer.Close()
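The node lookup and replacement can be tried on an in-memory document before pointing it at a real file. In this sketch the surrounding <root> element is made up, and I set InnerText rather than '#text'; both are meant to change the element's text here, but treat the swap as my choice, not part of the answer above.

```powershell
# Made-up document; only the <find-this-text> element comes from the question.
[xml]$xml = '<root><find-this-text>Replace This Text</find-this-text></root>'

# XPath lookup of the first matching element anywhere in the document.
$node = $xml.SelectSingleNode('//find-this-text')

# Replace the element's text content without knowing the old value.
$node.InnerText = 'replacement text'

$xml.OuterXml   # <root><find-this-text>replacement text</find-this-text></root>
```

Because the parser works on nodes, it does not matter where the element sits in the file or how much whitespace precedes it.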
The .* in the regular expression is greedy, which many consider dangerous. If the line that contains this tag and its data contains nothing else, then as far as I understand there isn't any significant risk.
$file = "c:\temp\sms.txt"
$OpenTag = "<find-this-text>"
$CloseTag = "</find-this-text>"
$NewText = $OpenTag + "New text" + $CloseTag
(Get-Content $file) | Foreach-Object {$_ -replace "$OpenTag.*$CloseTag", $NewText} | Set-Content $file
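If the greediness is a concern, a lazy quantifier is one possible variation. This sketch runs on an in-memory line with leading whitespace (the sample line and the $pattern and $result names are mine) and escapes the tags with [regex]::Escape() so they are always matched literally:

```powershell
$OpenTag  = '<find-this-text>'
$CloseTag = '</find-this-text>'
$NewText  = $OpenTag + 'New text' + $CloseTag

# Sample line with leading whitespace, standing in for one line of the file.
$line = '   <find-this-text>Replace This Text</find-this-text>'

# .*? stops at the first closing tag instead of the last one on the line.
$pattern = [regex]::Escape($OpenTag) + '.*?' + [regex]::Escape($CloseTag)

$result = $line -replace $pattern, $NewText
$result
```

The leading whitespace is preserved because only the tag pair and its contents are matched. If the new text could ever contain a $ character, it would need escaping in the replacement string as well.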