PowerShell replace unknown 3 letter word after operator [duplicate] - regex

I have a simple textfile and I need a powershell script to replace some parts of the file content.
My current script is the following:
$content = Get-Content -path "Input.json"
$content -Replace '"(\d+),(\d{1,})"', '$1.$2' | Out-File "output.json"
Is it possible to write it in one line without the content variable, like this?
Get-Content -path "Input.json" | ??? -Replace '"(\d+),(\d{1,})"', '$1.$2' | Out-File "output.json"
I don't know how to use the output of the first Get-Content cmdlet in the second command without the $content variable. Is there an automatic PowerShell variable for this?
Is it also possible to do more than one replacement in a pipeline?
Get-Content -path "Input.json" | ??? -Replace '"(\d+),(\d{1,})"', '$1.$2' | ??? -Replace 'second regex', 'second replacement' | Out-File "output.json"

Yes, you can do that in one line and don't even need a pipeline, as -replace works on arrays like you would expect it to do (and you can chain the operator):
(Get-Content Input.json) `
-replace '"(\d+),(\d{1,})"', '$1.$2' `
-replace 'second regex', 'second replacement' |
Out-File output.json
(Line breaks added for readability.)
The parentheses around the Get-Content call are necessary to prevent the -replace operator being interpreted as an argument to Get-Content.
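As a minimal sketch of the chained form, the snippet below applies two -replace operators to an in-memory array instead of a file. The sample lines and the second pattern ('weight' to 'mass') are hypothetical and only stand in for the contents of Input.json and the "second regex".

```powershell
# Hypothetical sample lines standing in for the contents of Input.json.
$lines = @(
    '{ "price": "12,50",',
    '  "weight": "3,125" }'
)

# -replace is applied to every element of the array, and the result of the
# first operator is fed into the second one.
$result = $lines `
    -replace '"(\d+),(\d{1,})"', '$1.$2' `
    -replace 'weight', 'mass'

$result
```

Each element is processed independently, so the output is again an array of two strings.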

Is it possible to write it in one line without the content variable, like this?
Yes: use ForEach-Object (or its alias %) and then $_ to reference the object on the pipeline:
Get-Content -path "Input.json" | % { $_ -Replace '"(\d+),(\d{1,})"', '$1.$2' } | Out-File "output.json"
Is it possible to do more replacements than one in a pipeline.
Yes.
As above: just add more ForEach-Object segments.
Alternatively, as -replace returns the result, the operators can be chained in a single expression:
($_ -replace $a,$b) -replace $c,$d
I suspect the parentheses are not needed, but I think they make it easier to read. Clearly, more than a few chained operators (especially if the matches/replacements are non-trivial) will not be easy to follow.

Related

powershell -replace regex

I have the following script which I try to run on various html files
$files = $args[0];
$string1 = $args[1];
$string2 = $args[2];
Write-Host "Replace $string1 with $string2 in $files";
gci -r -include "$files" |
foreach-object { $a = $_.fullname; ( get-content $a ) |
foreach-object {
$_ -replace "%string1" , "$string2" |
set-content $a
}
}
in an attempt to edit this line found in all the files.
<tr><td>TestCase</td></tr>
I call the script from powershell like this (it's called replace.ps1)
./replace *.html sampleTest myNewTest
but instead of changing sampleTest.html to myNewTest.html
it deletes everything in the doc except for the last line,
leaving all of the files like so:
/html
in fact, no matter what arguments I pass in this seems to happen.
Can anyone explain this/help me understand why it's happening?
Your loop structure is to blame here. You need to have the Set-Content located outside the loop. Your code is overwriting the file at every pass.
....
foreach-object { $a = $_.fullname; ( get-content $a ) |
foreach-object {
$_ -replace "$string1" , "$string2"
} | set-content $a
}
It might also have been a typo, but you had "%string1" before, which, while syntactically correct, was not what you intended.
Could also have used Add-Content but that would mean you have to erase the file first. set-content $a used at the end of the pipe is more intuitive.
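Your example does not actually need regex. You could have used $_.Replace($string1,$string2) with the same results here (note that the .Replace() method is case-sensitive and literal, while -replace is a case-insensitive regex match by default).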
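To illustrate how the regex-based -replace operator differs from the literal String.Replace() method, here is a small sketch. The sample text is hypothetical and not taken from the question's files.

```powershell
# Hypothetical sample text, not taken from the question's files.
$text = 'sampleTest.html and SAMPLETEST.html'

# -replace treats the pattern as a regex and is case-insensitive by default;
# the unescaped '.' matches any character.
$viaOperator = $text -replace 'sampletest.html', 'myNewTest.html'

# String.Replace() performs a literal, case-sensitive substitution.
$viaMethod = $text.Replace('sampleTest.html', 'myNewTest.html')

# [regex]::Escape() turns an arbitrary string into a literal regex pattern,
# which is useful when the search text comes from script arguments.
$escaped = [regex]::Escape('sampleTest.html')
$viaEscaped = $text -replace $escaped, 'myNewTest.html'

$viaOperator
$viaMethod
$viaEscaped
```

If the search string passed on the command line may contain regex metacharacters, escaping it first (or using .Replace()) avoids surprising matches.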

PowerShell regex filter files

I am trying to filter files using PowerShell, and I need to insert a new line character in between </tr><tr> to break those into separate lines and then remove all the lines that match <tr> lots of characters BTE lots of characters </tr> and save the files in place.
Forgive me, as I am new to PowerShell, and this is simple in SED, but I must use PowerShell. This is what I have but could be completely wrong.
Get-Content *.htm | Foreach-Object {$_ -replace '</tr><tr>', '</tr>\r\n<tr>'; $_}
Get-Content *.htm | Foreach-Object {$_ -replace '<tr>.*BTE.*</tr>', ''; $_}
So it just sounds like you need to save your changes back to the original files. Also we should just be able to make these changes in one pass instead of reading the files twice.
Get-ChildItem *.htm | Foreach-Object {
$singleFileName = $_.FullName
(Get-Content $singleFileName) -replace '</tr><tr>', "</tr>`r`n<tr>" -replace '<tr>.*BTE.*</tr>' | Set-Content $singleFileName
}
You can't read from and write to the same file within one pipeline. We place (Get-Content $singleFileName) in parentheses so that the whole file is read before Set-Content starts writing. By contrast, the following would fail:
Get-Content $singleFileName | Set-Content $singleFileName
As each line is passed down the pipe, the file is still open for reading, so Set-Content can't write to it.
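The in-place edit can be tried safely on a scratch file. The following is only a sketch: the temporary file and its single sample line are hypothetical, and the empty replacement string is written out explicitly.

```powershell
# Hypothetical scratch file so the sketch does not touch real .htm files.
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value '<tr>keep 1</tr><tr>drop BTE row</tr><tr>keep 2</tr>'

# The parentheses force Get-Content to finish (and close the file)
# before Set-Content opens the same path for writing.
(Get-Content -Path $path) `
    -replace '</tr><tr>', "</tr>`r`n<tr>" `
    -replace '<tr>.*BTE.*</tr>', '' |
    Set-Content -Path $path

# Read the file back; blank lines left by the second replacement are skipped.
$rows = Get-Content -Path $path | Where-Object { $_ -ne '' }
$rows

Remove-Item -Path $path
```

Note that the second replacement empties the matching row rather than deleting its line, so a blank line remains in the file unless it is filtered out.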
I don't think you have to insert the line break if RegEx is able to capture the group like this.
Get-ChildItem *.htm | Foreach-Object {
$singleFileName = $_.FullName
([RegEx]::Matches((Get-Content $singleFileName),'<tr>.*?</tr>')).Value|?{$_ -notlike '<tr>*BTE*</tr>'} | Set-Content $singleFileName
}

PowerShell regex export match contents

I am learning regex and am trying to get a better understanding by using a text file with the value $100,000 in it. What I am trying to do is to search the text file for the string "$100,000" and if it is there export the value out into a new CSV. this is what I'm using so far.
[io.file]::readalltext("c:\utilities\notes_$datetime.txt") -match("[$][0-9][0-9][0-9],[0-9][0-9][0-9]") | Out-File C:\utilities\amount.txt -Encoding ascii -Force
Which returns true. Can someone point me in the right direction as to grabbing the string value that it finds into a new CSV?
many thanks!
You're reading the file into a single string, not an array of lines, so -match only returns True or False. To get the matched values, use Select-String with -AllMatches instead of the -match operator:
[IO.File]::ReadAllText("c:\utilities\notes_$datetime.txt") |
Select-String '\$\d{3},\d{3}' -AllMatches |
% { $_.Matches.Groups.Value } |
Out-File C:\utilities\amount.txt -Encoding ascii -Force
As a side note, using Get-Content -Raw (available from PowerShell 3.0) would be slightly more PoSh than using .NET methods, although .NET methods provide better performance.
Get-Content "c:\utilities\notes_$datetime.txt" -Raw |
Select-String '\$\d{3},\d{3}' -AllMatches |
% { $_.Matches.Groups.Value } |
Out-File C:\utilities\amount.txt -Encoding ascii -Force
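A short sketch of the difference between the two approaches, using a hypothetical in-memory string rather than the notes file:

```powershell
# Hypothetical sample text standing in for the file contents.
$text = 'Budget: $100,000 approved, $250,500 pending'

# On a single string, -match returns a boolean; the first match is stored
# in the automatic $Matches variable.
$found = $text -match '\$\d{3},\d{3}'
$first = $Matches[0]

# Select-String -AllMatches exposes every match through the Matches property.
$all = ($text | Select-String -Pattern '\$\d{3},\d{3}' -AllMatches).Matches.Value

$found
$first
$all
```

This is why piping the result of -match to Out-File writes True: only the boolean reaches the pipeline.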
I prefer to use [regex]::match for that:
$x = 'text bla $100,000 text text'
[regex]::Match($x,'\$[\d]{3},[\d]{3}').Groups[0].Value
I also changed the expression a little bit ($ followed by 3 digits, followed by a "," and another 3 digits). The pattern is single-quoted so that PowerShell never tries to expand the $ as a variable.
So your script could look like this:
$fileContent = Get-Content "c:\utilities\notes_$datetime.txt"
[regex]::Match($fileContent,'\$[\d]{3},[\d]{3}').Groups[0].Value | Out-File C:\utilities\amount.txt -Encoding ascii -Force
Why not use the Select-String cmdlet - far easier:
Select-String .\infile.csv -pattern '\$[\d]{3},[\d]{3}' | Select Line | Out-File outfile.txt
You can then process multiple files like so:
Get-Childitem *.csv | Select-String -pattern '\$[\d]{3},[\d]{3}' | Select Line | Out-File outfile.txt
The objects returned by Select-String have, among others, the following properties:
Line - the line where the regex found a match
LineNumber - the line number in the file where the match was found
Filename - the name of the file the match was found in
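The properties listed above can be inspected with a quick sketch like the following. The temporary file and its two lines are hypothetical stand-ins for a real input file.

```powershell
# Hypothetical scratch file with two lines, only one of which matches.
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value 'no amount here', 'total: $100,000'

# Select-String returns one MatchInfo object per matching line.
$hit = Select-String -Path $path -Pattern '\$\d{3},\d{3}'

$hit.Line         # text of the matching line
$hit.LineNumber   # 1-based line number within the file
$hit.Filename     # file name without the directory part

$expectedName = Split-Path -Path $path -Leaf
Remove-Item -Path $path
```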

Powershell ignoring look behind regular expression to return entire line

A simple enough question I hope.
I have a text log file that includes the following line:
123,010502500114082000000009260000000122001T
I want to search through the log file and return the "00000000926" section of the above text. So I wrote a regular expression:
(?<=123.{17}).{11}
So when the look behind text equals '123' with 17 characters, return the next 11. This works fine when tested on online regex editors. However in Powershell the entire line is returned instead of the 11 characters I want and I can't understand why.
$InputFile = get-content logfile.log
$regex = '(?<=123.{17}).{11}'
$Inputfile | select-string $regex
(entire line is returned).
Why is powershell returning the entire line?
Don't discount Select-String just yet. As Briantist says, it is doing what you asked it to, but you need to extract the data you actually want, which can be done in one of two ways. Select-String returns Microsoft.PowerShell.Commands.MatchInfo objects, not just raw strings. We are also going to use Select-String's ability to take file input directly.
$InputFile = "logfile.log"
$regex = '(?<=123.{17}).{11}'
Select-string $InputFile -Pattern $regex | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
Or, if you have at least PowerShell 3.0:
(Select-string $InputFile -Pattern $regex).Matches.Value
Which gives in both cases
00000009260
It's because you're using Select-String which returns the line that matches (think grep).
$InputFile = get-content logfile.log | ForEach-Object {
if ($_ -match '(?<=123.{17})(.{11})') {
$Matches[1]
}
}
Haven't tested this, but it should work (or something similar).
You don't really need the lookaround regex for that:
$InputFile = get-content logfile.log
$InputFile -match '123.{28}' -replace '123.{17}(.{11}).+','$1'
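The approaches above can be compared on the sample line from the question without a log file. This is only a sketch: it shows where the whole line and the matched substring end up, and makes no claim beyond that.

```powershell
# The sample line from the question, used directly instead of logfile.log.
$line = '123,010502500114082000000009260000000122001T'
$pattern = '(?<=123.{17}).{11}'

# Select-String emits a MatchInfo object; its default display is the whole line.
$matchInfo = $line | Select-String -Pattern $pattern
$wholeLine = $matchInfo.Line
$viaSelectString = $matchInfo.Matches[0].Value

# [regex]::Match returns only the matched text via its Value property.
$viaRegexClass = [regex]::Match($line, $pattern).Value

$wholeLine
$viaSelectString
$viaRegexClass
```

Both extraction routes yield the same 11-character substring; the lookbehind itself is honored, it is just that Select-String prints the Line property by default.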

Use powershell ForEach-Object to match and replace string with regex

I use the below pipeline to read a file and replace a line in it and save it to another file, but found that the string in target file is not replaced, it's still the old one.
original line is : name-1a2b3c4d
new line should be: name-6a5e4r3h
(Get-Content "test1.xml") | ForEach-Object {$_ -replace '^name-.*$', "name-6a5e4r3h"} | Set-Content "test2.xml"
Anything missing there?
One thing you're missing is that the -replace operator works just fine on an array, which means you don't need that foreach-object loop at all:
(Get-Content "test1.xml") -replace '^name-.*$', 'name-6a5e4r3h' | Set-Content test2.xml
You're not changing the $_ variable.
You might try:
$lines = Get-Content $file
$len = $lines.count
for ($i = 0; $i -lt $len; $i++) {
$lines[$i] = $lines[$i] -replace $bad, $good
}
$lines > $outfile