Powershell -creplace doesn't find Expression at end of line - regex

I'm trying to find and replace some text at the end of line with Powershell. (ascii, txt, windows) I need to do this with a given script, which is already used for string replace:
$inputText = [system.IO.File]::ReadAllText("Text.txt")
$regex = '\\DE$|\DE_02'
$regex > test.txt
$th = [system.IO.File]::ReadAllText("test.txt")
foreach($expression in $th) {
if ($expression -eq 'EOF') { break }
$parts = $expression.Split("|")
if ($parts.Count -eq 2) {
$inputText = $InputText -creplace $parts
echo $inputText | out-file "Text_neu.txt" -enc ascii
}
}
The script works fine so far, but it cannot match the end of a line ($).
I also tried `r`n instead of $, but that didn't work either.
When I try like this:
$inputText = [system.IO.File]::ReadAllText("Text.txt")
$inputText.Replace("\DE`r`n","\DE_02`r`n") | Out-File Text_neu.txt
it's all replaced correctly.
How can I change the existing script so that it works as well?

I am not sure I understand your script correctly, but I think the problem is that you are running the replacement against the whole text and not against single lines.
So $ does not match at the end of each line (\r\n); by default it matches only at the end of the whole string.
You can change this behaviour with the inline modifier (?m), which makes $ match at the end of each line.
Try
$regex = '(?m)\\DE$|\DE_02'
as your regular expression.
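A minimal sketch of the difference, using a made-up three-line sample string (the sample text and the variable names are assumptions for illustration, not part of the original script). One caveat worth checking against your own file: in .NET, $ in multiline mode matches just before \n, so with Windows \r\n line endings the \r still sits between \DE and the position where $ matches. Allowing an optional \r in a lookahead is one way around that.

```powershell
# Hypothetical sample with Windows (CRLF) line endings
$sample = "C:\DE`r`nC:\DE_old`r`nC:\DE"

# Without (?m): $ only matches at the very end of the string,
# so only the last \DE is replaced
$noMultiline = $sample -creplace '\\DE$', '\DE_02'

# With (?m) and an optional \r before the line end:
# \DE is replaced at the end of every line
$multiline = $sample -creplace '(?m)\\DE(?=\r?$)', '\DE_02'

$noMultiline
$multiline
```

The lookahead keeps the \r out of the match, so the line endings are left untouched by the replacement.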

Related

Powershell - Regex match multiple lines from file

I am able to match and replace multiple lines if the text string is part of the PowerShell script:
$regex = @"
(?s)(--match from here--.*?
--up to here--)
"@
$text = @"
first line
--match from here--
other lines
--up to here--
last line
"@
$editedText = ($text -replace $regex, "")
$editedText | Set-Content ".\output.txt"
output.txt:
first line
last line
But if I instead read the text in from a file with Get-Content -Raw, the same regex fails to match anything.
$text = Get-Content ".\input.txt" -Raw
input.txt:
first line
--match from here--
other lines
--up to here--
last line
output.txt:
first line
--match from here--
other lines
--up to here--
last line
Why is this? What can I do to match the text read in from input.txt? Thanks in advance!
With a here-string, the code depends on the kind of newline characters used by the .ps1 file. The pattern won't match if those differ from the newline characters used by the input file.
To remove this dependency, define a regex that uses \r?\n to match both kinds of newline:
$regex = "(?s)(--match from here--.*?\r?\n--up to here--)"
$text = Get-Content "input.txt" -Raw
$editedText = $text -replace $regex, ""
$editedText | Set-Content ".\output.txt"
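To see that \r?\n really covers both cases, here is a small sketch with in-memory strings instead of files. The sample strings and variable names are assumptions for illustration, and the pattern is a slight variation of the one above: it also consumes the line break after the end marker, so no empty line is left behind.

```powershell
# Variation of the pattern above: also swallow the line break after the end marker
$regex = '(?s)--match from here--.*?\r?\n--up to here--\r?\n'

# The same hypothetical text, once with LF and once with CRLF line endings
$lf   = "first line`n--match from here--`nother lines`n--up to here--`nlast line"
$crlf = $lf -replace "`n", "`r`n"

$fromLf   = $lf   -replace $regex, ''
$fromCrlf = $crlf -replace $regex, ''

$fromLf
$fromCrlf
```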
Alternatively, you may use a switch-based solution, which lets you use simpler regex patterns:
$include = $true
& { switch -File 'input.txt' -RegEx {
'--match from here--' { $include = $false }
{ $include } { $_ } # Output line if $include equals $true
'--up to here--' { $include = $true }
}} | Set-Content 'output.txt'
The switch -File construct loops over all lines of the input file and passes each one to the match expressions.
When we find the 1st pattern we set an $include flag to $false, which causes the code to skip over all lines until after the 2nd pattern is found, which sets the $include flag back to $true.
Writing $_ on its own causes the current line to be output.
We pipe to Set-Content to reduce memory footprint of the script. Instead of reading all lines into a variable in memory, we use a streaming approach where each processed line is immediately passed to Set-Content. Note that we can't pipe directly from a switch block, so as workaround we wrap the switch inside a script block (& { ... } creates and calls the script block).
The idea has been adopted from this GitHub comment.

Powershell script to replace link:lalala.html[lalala] with xref:lalala.adoc[lalala] capture pattern and replace recursively

I have a folder full of text documents in .adoc format that have some text in them. The text is following: link:lalala.html[lalala]. I want to replace this text with xref:lalala.adoc[lalala]. So, basically, just replace link: with xref:, .html with .adoc, leave all the rest unchanged.
But the problem is that lalala can be anything from a word to ../topics/halva.html.
I definitely know that I need to use regex patterns, I previously used similar script. A replace directive wrapped in an object:
Get-ChildItem -Path *.adoc -file -recurse | ForEach-Object {
$lines = Get-Content -Path $PSItem.FullName -Encoding UTF8 -Raw
$patterns = @{
'(\[\.dfn \.term])#(.*?)#' = '$1_$2_' ;
}
$option = [System.Text.RegularExpressions.RegexOptions]::Singleline
foreach($k in $patterns.Keys){
$pat = [regex]::new($k, $option)
$lines = $pat.Replace($lines, $patterns.$k)
}
$lines | Set-Content -Path $PSItem.FullName -Encoding UTF8 -Force
}
Looks like I need a different script since the new task cannot be added as just another object. I could've just replaced each part separately, using two objects: replace link: with xref:, then replace .html with .adoc.
But this can interfere with other links that end with .html and don't start with link:. In the text, absolute links usually don't have link: in the beginning. They always start with http:// or https://. And they still may or may not end with .html. So the best idea is to take the whole string link:lalala.html[lalala] and try to replace it with xref:lalala.adoc[lalala].
I need the help of someone who knows regex and PowerShell, please this would save me.
As a pattern, you might use
\blink:(.+?)\.html(?=\[[^][]*])
\blink: Match link: preceded by a word boundary
(.+?) Capture 1+ chars, as few as possible, in group 1
\.html Match .html
(?=\[[^][]*]) Positive lookahead: assert that an opening square bracket up to its closing square bracket follows directly to the right
In the replacement use group 1 using $1
xref:$1.adoc
Example
$Strings = @("link:lalala.html[lalala]", "link:../topics/halva.html[../topics/halva.html]")
$Strings -replace "\blink:(.+?)\.html(?=\[[^][]*])",'xref:$1.adoc'
Output
xref:lalala.adoc[lalala]
xref:../topics/halva.adoc[../topics/halva.html]
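Since the question asks for this to be applied to a whole folder recursively, the pattern could be combined with the file loop from the question roughly as follows. This is only a sketch: the function name Convert-LinkToXref and its -Root parameter are made up for illustration, and it rewrites files in place, so try it on a copy first.

```powershell
# Hypothetical helper: rewrite link:...html[...] to xref:...adoc[...] in all .adoc files under $Root
function Convert-LinkToXref {
    param([string]$Root = '.')

    $pattern     = '\blink:(.+?)\.html(?=\[[^][]*])'
    $replacement = 'xref:$1.adoc'

    Get-ChildItem -Path $Root -Filter *.adoc -File -Recurse | ForEach-Object {
        $text = Get-Content -LiteralPath $_.FullName -Encoding UTF8 -Raw
        if ($null -eq $text) { return }   # empty file: nothing to do, go to the next one

        $updated = $text -replace $pattern, $replacement

        # Only touch files that actually changed
        if ($updated -cne $text) {
            Set-Content -LiteralPath $_.FullName -Value $updated -Encoding UTF8 -NoNewline
        }
    }
}

# Usage (assumed folder layout): Convert-LinkToXref -Root .\docs
```

Absolute links that start with http:// or https:// and have no link: prefix are left alone, because the pattern requires link: at the start of the match.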

Replace text between two string powershell

I have a question that I'm pretty much stuck on.
I have a file called xml_data.txt and another file called entry.txt
I want to replace everything between <core:topics> and </core:topics>
I have written the below script
$test = Get-Content -Path ./xml_data.txt
$newtest = Get-Content -Path ./entry.txt
$pattern = "<core:topics>(.*?)</core:topics>"
$result0 = [regex]::match($test, $pattern).Groups[1].Value
$result1 = [regex]::match($newtest, $pattern).Groups[1].Value
$test -replace $result0, $result1
When I run the script, it prints output to the console, but it doesn't look like it made any change.
Can someone please help me out?
Note: Typo error fixed
There are three main issues here:
You read the file line by line, but the blocks of text are multiline strings
Your regex does not match across newlines, as . does not match a newline by default
Also, literal text used as a regex pattern must be escaped, and when replacing with a dynamic replacement pattern, you must always dollar-escape the $ symbol. Or use the simple string .Replace method.
So, you need to
Read the whole file into a single variable: $test = Get-Content -Path ./xml_data.txt -Raw (and likewise for entry.txt)
Use the $pattern = "(?s)<core:topics>(.*?)</core:topics>" regex (if it turns out to be too slow, it can be enhanced by unrolling it to <core:topics>([^<]*(?:<(?!/?core:topics>)[^<]*)*)</core:topics>)
Use $test -replace [regex]::Escape($result0), $result1.Replace('$', '$$') to "protect" $ chars in the replacement, or $test.Replace($result0, $result1).
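Put together, the three fixes might look like the sketch below. To keep it self-contained, the two files are replaced by in-memory strings; their contents are invented for illustration (including the $ signs, which are there to show why the escaping matters).

```powershell
# Hypothetical stand-ins for xml_data.txt and entry.txt as read with Get-Content -Raw
$test    = "<root>`n<core:topics>`nold topic (costs `$5)`n</core:topics>`n</root>"
$newtest = "<core:topics>`nnew topic (costs `$10)`n</core:topics>"

# (?s) lets . match newlines, so the multiline block is captured
$pattern = '(?s)<core:topics>(.*?)</core:topics>'
$result0 = [regex]::Match($test, $pattern).Groups[1].Value
$result1 = [regex]::Match($newtest, $pattern).Groups[1].Value

# Option 1: regex replace with an escaped pattern and $-escaped replacement
$viaRegex = $test -replace [regex]::Escape($result0), $result1.Replace('$', '$$')

# Option 2: plain string replace, no escaping needed
$viaString = $test.Replace($result0, $result1)

$viaRegex
```

Both options produce the same text here. The plain .Replace method is case-sensitive and replaces every occurrence of the old block, which is usually what is wanted for a literal swap like this.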

How to use regex to remove everything except certain "key"/"character containing"

Running my code gives me this output in a txt file:
19:27:28.636 ASSOS\032AB5601\0223-\032312DEEE8EB423._http._tcp.local. can
be reached at ASSOS-032DEEE8EB423.local.:80 (interface 1)
So I just want to parse out string "ASSOS-032DEEE8EB423.local" and remove everything else from the txt file. I can't figure out how to use regex to do so to remove everything except string containing ASSOS-. So the thing is that the string will always contain ASSOS- but the rest is always changing to different numbers. So I'm trying to always be able to get ASSOS-XXXXXXXXXXX.local
This is how I'm trying to do:
$string = 'Get-Content C:\MyFile.Txt'
$pattern = ''
$string -replace $pattern, ' '
It's just that I don't know so much about regex and how to write it to parse out string containing "ASSOS-" and remove everything after ASSOS-XXXXXXXXXXX.local
I would pipe the file content to Select-String and return the values of matches for a string starting with "ASSOS-", ending with "local" and having whatever non-whitespace characters in between:
Get-Content test.txt | Select-String -Pattern "ASSOS-\S*local" | ForEach-Object {$_.Matches.Value}
A possible solution:
$str = "19:27:28.636 ASSOS\032AB5601\0223-\032312DEEE8EB423._http._tcp.local. can
be reached at **ASSOS-032DEEE8EB423.local**.:80 (interface 1)"
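$str -replace '(?s).*\*\*(.*?)\*\*.*', '$1'
The regex captures all characters between the ** markers (**...**). Each * has to be escaped with a \ because * is a regex quantifier. The (?s) modifier lets . match the line break in the sample, so the text before and after the markers is removed as well.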
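If the text does not contain ** markers (they may only have been emphasis in the question), the value can also be pulled out directly with a match instead of a replace. A small sketch, assuming the host name always starts with ASSOS- and ends with .local, as described in the question:

```powershell
# Sample output from the question, without the ** markers
$str = "19:27:28.636 ASSOS\032AB5601\0223-\032312DEEE8EB423._http._tcp.local. can`nbe reached at ASSOS-032DEEE8EB423.local.:80 (interface 1)"

# ASSOS- followed by as few non-whitespace characters as possible, up to the first ".local"
$name = [regex]::Match($str, 'ASSOS-\S*?\.local').Value

$name
```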

Powershell replace exact string

I want to replace a simple string "WEEK." (with a dot) in a text file with the string "TEST"
$LOG= "C:\FILE.TXT"
$A= "TEST"
(Get-Content $LOG) | Foreach { $_ -Replace "WEEK.", $A } | Set-Content $LOG;
The problem is that my file has this content:
WEEK_A WEEK.
And when I run my script the result is:
TESTA TEST
and the result that i want is:
WEEK_A TEST
I tried "^WEEK." and "^WEEK.$", but that did not work.
Can you help me with the regexp? Thanks
====== EDIT ==================
OK. I tried
$LOG= "C:\FILE.TXT"
$A= "TEST"
(Get-Content $LOG) | Foreach { $_ -Replace "WEEK\.", $A } | Set-Content $LOG;
and it seems to work.
This happened because you used the pattern WEEK. with an unescaped dot. In a regular expression, the dot means "any character", which is why both WEEK_ and WEEK. were replaced.
Once you added the backslash, the dot was escaped, i.e. it lost its special meaning and matched only a literal dot. That is what made it work.
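The behaviour can be reproduced on the sample line from the question. The sketch below also shows two ways to avoid escaping by hand, [regex]::Escape and the plain string .Replace method; the variable names are just for illustration.

```powershell
$line = 'WEEK_A WEEK.'

# Unescaped dot: matches "WEEK_" as well as "WEEK."
$unescaped = $line -replace 'WEEK.', 'TEST'

# Escaped dot: matches only the literal "WEEK."
$escaped = $line -replace 'WEEK\.', 'TEST'

# Let .NET do the escaping, handy when the search text is in a variable
$viaEscape = $line -replace [regex]::Escape('WEEK.'), 'TEST'

# No regex at all: plain, case-sensitive string replacement
$viaString = $line.Replace('WEEK.', 'TEST')

$unescaped   # TESTA TEST
$escaped     # WEEK_A TEST
```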