I have a text file that was generated with PowerShell.
It contains a line that starts with Total Value: $
That line has a dollar amount which contains a thousands-separator comma.
I would like to delete that comma, but only on that line.
I have tried the following, but it also removes commas in places where I want to keep them.
$Files = Get-ChildItem "C:\Users\User\Summary.txt"
foreach ($file in $Files)
{
$file |
Get-Content |
% {$_ -replace '([\d]),([\d])','$1$2' } |
out-file "C:\Users\User\Summary2.csv" -append -encoding ascii
}
This works, but again it removes commas in parts of the file where I was hoping they could remain.
Any assistance is appreciated.
You can use
foreach ($file in $Files)
{
    (Get-Content $file -Raw) -replace '(?m)(?<=^Total\s+Value:\s*\$[\d,]*),','' |
        Out-File "C:\Users\User\Summary2.csv" -Append -Encoding ascii
}
See the regex demo.
Details:
Get-Content $file -Raw reads the whole file into a single string
(?m)(?<=^Total\s+Value:\s*\$[\d,]*), is a regex that matches
(?m) - multiline mode on, so ^ matches at the start of every line
(?<=^Total\s+Value:\s*\$[\d,]*) - a positive lookbehind that matches a location that is immediately preceded with
^ - start of a line
Total\s+Value:\s* - Total, one or more whitespace chars, Value:, then zero or more whitespace chars
\$[\d,]* - a $ char and then zero or more digits or commas (a dollar price integer part)
, - a comma (it is removed because the -replace operator is used with an empty replacement string, which can even be omitted)
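A quick way to check the pattern is to run it against an in-memory string before touching any file. The sample text below is made up for illustration and is not taken from the actual Summary.txt:

```powershell
# Hypothetical sample content; only the "Total Value" line should change.
$text = "Item A: 1,200 units`nTotal Value: `$1,234,567.89`nNotes: red,green,blue"

# Same pattern as above: a comma is removed only when it follows
# "Total Value: $" plus digits/commas on the same line.
$result = $text -replace '(?m)(?<=^Total\s+Value:\s*\$[\d,]*),', ''
$result
```

With this sample, the commas in the first and last lines are left alone and the middle line becomes Total Value: $1234567.89.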
I'm trying to list all the files that contain multiple non-consecutive backslashes in each line.
Here is my PowerShell script:
Get-ChildItem -Path "D:\config_files" -Include "*.xml","*.txt" -Recurse |
Foreach-Object{
$file = $_.FullName
(Get-Content $file) |
Where-Object{
$_ -match '^(.*)=(")(.*?[^\\])(\\.*)(")(.*)$'
} |
Select-Object -Unique |
ForEach-Object{
Write-Host "$file : $_"
$_ | Out-File -FilePath 'matches.txt' -Append
}
}
Here's my regex
^(.*)=(")(.*?[^\\])(\\.*)(")(.*)$
These are the expected conditions.
starts with characters
followed by ="
contains non-consecutive backslash
followed by "
End with any characters
The regex should detect the text below
<add key="12345 value="\\machine\001\0z991\master" />
<settings file="..\app\service\config\settings.config">
<key="config" value="..\app\bin\config"/>
The problem is that it only works on a single line. I have already added '$' at the end of the pattern.
I have multiple files like this:
TEST:200333
75252
TEST:198234
201756
TEST:201616
274
TEST:200118
934521
TEST:123456
1234
and I want an output like this
200333;75252
198234;201756
201616;274
200118;934521
123456;1234
I tried this code but it doesn't work:
powershell -Command "(gc myFile.txt) -replace 'TEST:(\.+)\r\n(\.+)\r', '\1;\2' | Out-File -encoding ASCII mynewFile.txt"
You can use
powershell -Command "(gc myFile.txt -Raw) -replace '(?m)^TEST:(\d+)\r?\n(\d+)\r?$', '$1;$2' | Out-File -encoding ASCII mynewFile.txt"
Or,
powershell -Command "[system.io.file]::ReadAllText('myFile.txt') -replace '(?m)^TEST:(\d+)\r?\n(\d+)\r?$', '$1;$2' | Out-File -encoding ASCII mynewFile.txt"
See the regex demo. Note the use of the -Raw option, which slurps the whole file into a single string.
The regex matches
(?m) - multiline mode on
^ - line start
TEST: - some fixed text
(\d+) - Group 1: one or more digits
\r?\n - a CRLF (carriage return + line feed)/LF (line feed) line ending
(\d+) - Group 2: one or more digits
\r? - an optional CR (carriage return)
$ - end of a line.
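To see the pattern at work without creating a file, here is a small sketch on an in-memory string. The sample uses CRLF line endings, which is an assumption about how the real files are saved:

```powershell
# Two TEST/value pairs with CRLF line endings (assumed sample data).
$text = "TEST:200333`r`n75252`r`nTEST:198234`r`n201756`r`n"

# Same regex as above; each pair of lines collapses to "first;second".
$joined = $text -replace '(?m)^TEST:(\d+)\r?\n(\d+)\r?$', '$1;$2'

# The trailing \r? consumes the CR, so the result is split on LF only.
$lines = $joined -split "`n" | Where-Object { $_ -ne '' }
$lines
```

Because the optional CR at the end of each pair is part of the match, the joined text ends up with plain LF line endings.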
A no-regex approach that just concatenates every other line:
powershell -Command "gc myFile.txt | % {$i=0}{if($i++%2){\"$prev;$_\"}else{$prev=$_.Substring(5)}} | sc mynewFile.txt -enc ASCII"
Less code-golfy version:
Get-Content myFile.txt | ForEach-Object -Begin {$i = 0} {
    if ($i++ % 2 -ne 0) {          # odd line
        "$prev;$_"                 # output "value from previous line;current line"
    } else {                       # even line
        $prev = $_.Substring(5)    # remember value, cut off the "TEST:"
    }
} | Set-Content mynewFile.txt -Encoding ASCII
I would like to conditionally replace a character sequence from strings in a tab delimited file.
In the example below, I want to replace 'apple' with 'orange' when the character sequence starts with 'DEF'. 'xxx' can be any characters of any length (but is unlikely to contain 'DEF' or 'apple').
i.e., from:
xxxDEFxxxapplexxx<tab>DEFxxxapplexxx<tab>xxxDEFxxxapplexxx
to:
xxxDEFxxxapplexxx<tab>DEFxxxorangexxx<tab>xxxDEFxxxapplexxx
PowerShell script:
$fileName = "tabfile.txt"
(Get-Content -Path $fileName -Encoding UTF8) |
Foreach-Object { if ($_ -match "^DEF") { $_ -replace "apple", "orange"} else { $_ } } |
Set-Content -Path $fileName
It works fine when each string is separated by a new line (rather than a tab).
Output:
xxxDEFxxxapplexxx
DEFxxxorangexxx
xxxDEFxxxapplexxx
but doesn't work when the strings are separated by tabs (or spaces):
Output:
xxxDEFxxxapplexxx<tab>DEFxxxapplexxx<tab>xxxDEFxxxapplexxx
Thanks.
With help from the comments by iRon and Thomas, I figured out something that works:
Split the string at the tabs to create an array:
Get-Content with -Delimiter "`t" parameter.
Conditional match and replace text on each element:
Foreach-Object { if ($_ -match "^DEF") { $_ -replace "apple", "orange"} else { $_ } } |
Recreate original string by joining each element of the array with a tab character:
Join-String -Separator "`t"
Complete code:
# The parentheses make Get-Content finish reading before Out-File reopens the same file.
(Get-Content -Path "tabfile.txt" -Delimiter "`t") |
    Foreach-Object { if ($_ -match "^DEF") { $_ -replace "apple", "orange"} else { $_ } } |
    Join-String -Separator "`t" |
    Out-File "tabfile.txt"
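Join-String is not available in Windows PowerShell 5.1, so here is a sketch of the same split, replace, and join idea using only the -split and -join operators. It works on a single in-memory line; the sample value is the one from the question, and the variable names are just for illustration:

```powershell
# One tab-delimited record, as in the question.
$line = "xxxDEFxxxapplexxx`tDEFxxxapplexxx`txxxDEFxxxapplexxx"

# Split on tabs, fix only the fields that start with DEF, then re-join.
$fixed = ($line -split "`t" | ForEach-Object {
    if ($_ -match '^DEF') { $_ -replace 'apple', 'orange' } else { $_ }
}) -join "`t"

$fixed
```

Only the middle field starts with DEF, so only that field ends up with orange.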
You do not need any conditional logic here because -replace does it for you implicitly: if there is no match, the string input is returned as is.
The regex you can use is
(?<=(?:^|\t)DEF_)apple
See the regex demo. Add a \b word boundary after apple if it must not be followed by _, a letter, or a digit; add (?![^\W_]) instead if it must not be followed by a letter or digit but may be followed by _.
Details:
(?<=(?:^|\t)DEF_) - a positive lookbehind that matches a location that is immediately preceded with start of string (^) or (|) a tab (\t) and DEF_
apple - an apple string.
In PowerShell, you could use
(Get-Content -Path $fileName -Encoding UTF8) -replace "(?<=(?:^|\t)DEF_)apple", "orange" | Set-Content -Path $fileName
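The pattern above expects DEF_ to sit directly before apple. If, as in the question's sample, arbitrary characters can appear between DEF and apple, one possible variant (my assumption about the data, not part of the regex above) is to let the lookbehind skip any non-tab characters:

```powershell
# Sample record from the question: only the middle field starts with DEF.
$line = "xxxDEFxxxapplexxx`tDEFxxxapplexxx`txxxDEFxxxapplexxx"

# [^\t]* keeps the lookbehind inside the current tab-delimited field.
# .NET regex supports this kind of variable-length lookbehind.
$fixed = $line -replace '(?<=(?:^|\t)DEF[^\t]*)apple', 'orange'

$fixed
```

Since [^\t]* cannot cross a tab, a DEF at the start of one field never affects an apple in a later field.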
I want to replace some text in every script file in a folder, and I'm trying to use this PS code:
$pattern = '(FROM [a-zA-Z0-9_.]{1,100})(?<replacement_place>[a-zA-Z0-9_.]{1,7})'
Get-ChildItem -Path 'D:\Scripts' -Recurse -Include *.sql | ForEach-Object { (Get-Content $_.fullname) -replace $pattern, 'replace text' | Set-Content $_.fullname }
But I have no idea how to keep the first part of the expression and replace only the second one. Any idea how I can do this? Thanks.
I am not sure the provided regex for table names is correct, but in any case you can refer to captures in the replacement with $1, $2, and so on, using the following syntax: 'Doe, John' -ireplace '(\w+), (\w+)', '$2 $1'
Note that the replacement pattern either needs to be in single quotes ('') or have the $ signs of the replacement group specifiers escaped ("`$2 `$1").
# It may be better to use $pattern = '(FROM) (?<replacement_place>[a-zA-Z0-9_.]{1,7})'
$pattern = '(FROM [a-zA-Z0-9_.]{1,100})(?<replacement_place>[a-zA-Z0-9_.]{1,7})'
Get-ChildItem -Path 'D:\Scripts' -Recurse -Include *.sql | ForEach-Object {
    (Get-Content $_.FullName) |
        ForEach-Object { $_ -replace $pattern, '$1 replace text' } |
        Set-Content $_.FullName -Force
}
If you need to reference other variables in your replacement expression (as you may), you can use a double-quoted string and escape the capture dollars with a backtick:
{ $_ -replace $pattern, "`$1 replacement text with $somePoshVariable" } |
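As a small sketch of the quoting rules, here is the 'Doe, John' example in both forms. The $title variable is hypothetical and only there to show PowerShell variable expansion next to regex group references:

```powershell
# Single quotes: $1 and $2 reach the regex engine untouched.
$swapped = 'Doe, John' -replace '(\w+), (\w+)', '$2 $1'

# Double quotes: $title is expanded by PowerShell, while the backticks
# keep $2 and $1 literal so the regex engine still sees group references.
$title = 'Dr.'   # hypothetical variable for illustration
$withTitle = 'Doe, John' -replace '(\w+), (\w+)', "$title `$2 `$1"

$swapped
$withTitle
```

The first call yields John Doe and the second yields Dr. John Doe. Without the backticks, PowerShell would try to expand $2 and $1 as its own variables before the regex engine ever saw them.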
I am using a regular expression search to match up and replace some text. The text can span multiple lines (may or may not have line breaks).
Currently I have this:
$regex = "\<\?php eval.*?\>"
Get-ChildItem -exclude *.bak | Where-Object {$_.Attributes -ne "Directory"} |ForEach-Object {
$text = [string]::Join("`n", (Get-Content $_))
$text -replace $RegEx ,"REPLACED"}
Try this:
$regex = New-Object Text.RegularExpressions.Regex "\<\?php eval.*?\>", ('singleline', 'multiline')
Get-ChildItem -exclude *.bak |
    Where-Object {!$_.PsIsContainer} |
    ForEach-Object {
        $text = (Get-Content $_.FullName) -join "`n"
        $regex.Replace($text, "REPLACED")
    }
A regular expression is explicitly created via New-Object so that options can be passed in.
Try changing your regex pattern to:
"(?s)\<\?php eval.*?\>"
to get singleline mode (the dot then matches any char, including line terminators). Since you aren't using the ^ or $ metacharacters, I don't think you need to specify multiline (which makes ^ and $ match at embedded line terminators).
Update: It seems that -replace makes sure the regex is case-insensitive so the i option isn't needed.
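To illustrate the difference, here is a minimal sketch that runs the pattern with and without (?s) against a made-up multi-line snippet (the sample text is an assumption, not from a real file):

```powershell
# Hypothetical input where the php block spans several lines.
$text = "before`n<?php eval(`n  'x'`n);?>`nafter"

# Without (?s) the dot stops at each newline, so nothing matches.
$noFlag = $text -replace '\<\?php eval.*?\>', 'REPLACED'

# With (?s) the dot also matches newlines, so the whole block is replaced.
$withFlag = $text -replace '(?s)\<\?php eval.*?\>', 'REPLACED'

$noFlag -eq $text   # the text is unchanged without the flag
$withFlag
```

With the flag, everything from <?php eval up to the first > is replaced, leaving the before and after lines intact.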
Alternatively, use the (.|\n)+ expression to cross line boundaries,
since . doesn't match newlines by default.