PowerShell: replace lines starting with a given pattern - regex

Background:
I am trying to read a config file and place it in a string variable for further use. For now I've managed to remove the newlines, which was fairly simple. However, I'm having a bit of trouble with removing comments (lines starting with a #).
What I have so far
$var = (Get-Content $HOME/path/to/config.txt -Raw).Replace("`r`n","")
What I've tried
A lot of it is something along the lines of
$var = (Get-Content $HOME/path/to/config.txt -Raw).Replace("#.*?`r`n","").Replace("`r`n","")
( .Replace("(?<=#).*?(?=`r`n)","") , .Replace('^[^#].*?`r`n','') etc)
Many of the resources I've found deal with iteratively reading from a file and writing back to it or to a new one, but what I need is for the result to stay in a variable and for the original file not to be altered in any way (I'd also rather avoid temp files, or even temp variables, if possible). I think there is something fundamental I'm missing about the input to Replace. (I also found this semi-relevant piece on ignoring commented lines with ConvertFrom-Csv: "Using Import-CSV in Powershell, ignoring commented lines".)
Sample input:
text;
weird text;
other-sort-of-text;
#commented out possibility;
more-input/with-comment;#comment
Sample output:
text;weird text;other-sort-of-text;more-input/with-comment;
Additional info:
I am going to run this on current builds of Windows 10; locally I currently seem to have PowerShell version 5.1.14393.693.

Split the string at semicolons, remove the elements starting with a #, then join the result back into a string.
((Get-Content $HOME/path/to/config.txt -Raw).Replace("`r`n","") -split ';' |
Where-Object { $_.Trim() -notlike '#*' }) -join ';'

This might be identical to some of the other responses, but here is how I would do it:
$Data = Get-Content $HOME/path/to/config.txt -Raw
(($Data -split(';')).replace("`r`n",'') | Where-Object { $_ -notlike '#*' }) -join(';')
Whichever way you do it, remember that `r`n needs to be expanded, so it has to be enclosed in double quotes, unlike the rest of your characters.
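On the "something fundamental" about Replace: the .Replace() method on a string is the .NET String.Replace method, which does a literal substring replacement and does not interpret regex syntax, whereas the -replace operator treats its first operand as a regular expression. A minimal sketch of a regex-only variant, assuming the sample input from the question (built inline here instead of being read from the config file):

```powershell
# Sample input from the question, built inline so the sketch is self-contained.
# With the real file this would be: $raw = Get-Content $HOME/path/to/config.txt -Raw
$raw = "text;`r`nweird text;`r`nother-sort-of-text;`r`n#commented out possibility;`r`nmore-input/with-comment;#comment`r`n"

# First -replace: drop everything from a '#' up to (not including) the end of that line.
# Second -replace: remove the line breaks themselves (CRLF or bare LF).
$var = $raw -replace '#[^\r\n]*', '' -replace '\r?\n', ''

$var
```

The patterns are in single quotes so that PowerShell passes \r and \n to the regex engine untouched; nothing is written back to the file.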

#############solution 1 with convertfrom-string####################
#short version
(gc "$HOME/path/to/config.txt" | ? {$_ -notlike "#*"} | cfs -D ";").P1 -join ";"
#verbose version
(Get-Content "$HOME/path/to/config.txt" | where {$_ -notlike "#*"} | ConvertFrom-String -Delimiter ";").P1 -join ";"
#############Solution 2 with convertfrom-csv#######################
(Get-Content "C:\temp\test\config.txt" | where {$_ -notlike "#*"} | ConvertFrom-csv -Delimiter ";" -Header "P1").P1 -join ";"
#############Solution 3 with split #######################
(Get-Content "C:\temp\test\config.txt" | where {$_ -notlike "#*"} | %{$_.Split(';')[0]}) -join ";"

Related

How can i replace all lines in a file with a pattern using Powershell?

I have a file with lines that i wish to remove like the following:
key="Id" value=123"
key="FirstName" value=Name1"
key="LastName" value=Name2"
<!--key="FirstName" value=Name3"
key="LastName" value=Name4"-->
key="Address" value=Address1"
<!--key="Address" value=Address2"
key="FirstName" value=Name1"
key="LastName" value=Name2"-->
key="ReferenceNo" value=765
I have tried the following:
$values = @('key="FirstName"', 'key="Lastname"', 'add key="Address"')
$regexValues = [string]::Join('|', $values)
$lineprod = Get-Content "D:\test\testfile.txt" | Select-String $regexValues | Select-Object -ExpandProperty Line
if ($null -ne $lineprod)
{
    foreach ($value in $lineprod)
    {
        $prod = $value.Trim()
        $contentProd | ForEach-Object {$_ -replace $prod,""} | Set-Content "D:\test\testfile.txt"
    }
}
The issue is that only some of the lines get replaced and/or removed, while others remain.
The output should be
key="Id" value=123"
key="ReferenceNo" value=765
But i seem to get
key="Id" value=123"
key="ReferenceNo" value=765
<!--key="Address" value=Address2"
key="FirstName" value=Name1"
key="LastName" value=Name2"-->
Any ideas as to why this is happening or changes to the code above ?
Based on your comment, the token 'add key="Address"' should be changed to just 'key="Address"'; after that, the logic that joins the values into a regex looks good. You need the -NotMatch switch so that Select-String returns only the lines that do not match any of those values. Select-String can also read files itself, so Get-Content can be removed.
Note that the use of (...) in this case is important because you're reading from and writing to the same file in the same pipeline. Wrapping the statement in parentheses ensures that all output from Select-String is collected before anything is passed down the pipeline. Otherwise, you would end up with an empty file.
$values = 'key="FirstName"', 'key="Lastname"', 'key="Address"'
$regexValues = [string]::Join('|', $values)
(Select-String D:\test\testfile.txt -Pattern $regexValues -NotMatch) |
ForEach-Object Line | Set-Content D:\test\testfile.txt
Outputs:
key="Id" value=123"
key="ReferenceNo" value=765
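The same parenthesized read-then-write idea can be tried safely on a scratch file. This is only a sketch: the temp file name is made up for illustration, and Get-Content with Where-Object -notmatch is used here as a stand-in for the Select-String -NotMatch call above.

```powershell
# Hypothetical scratch file so the real D:\test\testfile.txt is not touched
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'testfile-sketch.txt'
Set-Content $path 'key="Id" value=123"', 'key="FirstName" value=Name1"', 'key="LastName" value=Name2"', 'key="ReferenceNo" value=765'

$regexValues = 'key="FirstName"|key="LastName"'

# The parentheses make Get-Content read the whole file before Set-Content
# starts writing to the same path.
(Get-Content $path) | Where-Object { $_ -notmatch $regexValues } | Set-Content $path

$result = Get-Content $path
$result

# Clean up the scratch file
Remove-Item $path
```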

Regular expression seems not to work in Where-Object cmdlet

I am trying to add quote characters around two fields in a file of comma separated lines. Here is one line of data:
1/22/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
which I would like to become this:
1/22/2018 0:00:00,"0000000","001B9706BE",1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
I began developing my regular expression in a simple PowerShell script, and soon I have the following:
$strData = '1/29/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0'
$strNew = $strData -replace "([^,]*),([^,]*),([^,]*),(.*)",'$1,"$2","$3",$4'
$strNew
which gives me this output:
1/29/2018 0:00:00,"0000000","001B9706BE",1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
Great! I'm all set. Extend this example to the general case of a file of similar lines of data:
Get-Content test_data.csv | Where-Object -FilterScript {
$_ -replace "([^,]*),([^,]*),([^,]*),(.*)", '$1,"$2","$3",$4'
}
This is a listing of test_data.csv:
1/29/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104938428,0016C4C483,1,45,0,1,0,0,0,0,0,0,0,0,0,0,35,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104943875,0016C4B0BC,1,31,0,1,0,0,0,0,0,0,0,0,0,0,25,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104948067,0016C4834D,1,33,0,1,0,0,0,0,0,0,0,0,0,0,23,0,1,0,0,0,0,0,0,0,0,0,0
This is the output of my script:
1/29/2018 0:00:00,0000000,001B9706BE,1,21,0,1,0,0,0,0,0,0,0,0,0,0,13,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104938428,0016C4C483,1,45,0,1,0,0,0,0,0,0,0,0,0,0,35,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104943875,0016C4B0BC,1,31,0,1,0,0,0,0,0,0,0,0,0,0,25,0,1,0,0,0,0,0,0,0,0,0,0
1/29/2018 0:00:00,104948067,0016C4834D,1,33,0,1,0,0,0,0,0,0,0,0,0,0,23,0,1,0,0,0,0,0,0,0,0,0,0
I have also tried this version of the script:
Get-Content test_data.csv | Where-Object -FilterScript {
$_ -replace "([^,]*),([^,]*),([^,]*),(.*)", "`$1,`"`$2`",`"`$3`",$4"
}
and obtained the same results.
My simple test script has convinced me that the regex is correct, but something happens when I use that regex inside a filter script in the Where-Object cmdlet.
What simple, yet critical, detail am I overlooking here?
Here is my PSVersion:
Major Minor Build Revision
----- ----- ----- --------
5 0 10586 117
You're misunderstanding how Where-Object works. The cmdlet outputs those input lines for which the -FilterScript expression evaluates to $true. It does NOT output whatever you do inside that scriptblock (you'd use ForEach-Object for that).
You don't need either Where-Object or ForEach-Object, though. Just put Get-Content in parentheses and use that as the first operand for the -replace operator. You also don't need the 4th capturing group. I would recommend anchoring the expression at the beginning of the string, though.
(Get-Content test_data.csv) -replace '^([^,]*),([^,]*),([^,]*)', '$1,"$2","$3"'
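The difference between the two cmdlets can be seen on a couple of in-memory strings. This is a small sketch with made-up sample values rather than the real test_data.csv:

```powershell
# Made-up sample lines standing in for the CSV file
$lines = 'a,b,c,d', 'e,f,g,h'

# Where-Object only uses the script block's result as a true/false test.
# A non-empty string counts as $true, so the ORIGINAL line is passed through.
$filtered = $lines | Where-Object { $_ -replace '^([^,]*),([^,]*)', '$1,"$2"' }

# ForEach-Object outputs whatever the script block produces,
# so the REPLACED line is what comes out.
$mapped = $lines | ForEach-Object { $_ -replace '^([^,]*),([^,]*)', '$1,"$2"' }

$filtered   # unchanged input lines
$mapped     # lines with the second field quoted
```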
This seems to work here. I used ForEach-Object to process each record.
Get-Content test_data.csv |
ForEach-Object { $_ -replace "([^,]*),([^,]*),([^,]*),(.*)", '$1,"$2","$3",$4' }
This also seems to work. Uses the ? to create a reluctant (lazy) capture.
Get-Content test_data.csv |
ForEach-Object { $_ -replace '(.*?),(.*?),(.*?),(.*)', '$1,"$2","$3",$4' }
I would just make a small change to what you have in order for this to work. Simply change the script to the following, noting that I replaced Where-Object -FilterScript with ForEach-Object and fixed a minor typo on the last item in the replacement string (the $4 was not escaped with a backtick inside the double quotes):
Get-Content c:\temp\test_data.csv | ForEach-Object {
$_ -replace "([^,]*),([^,]*),([^,]*),(.*)", "`$1,`"`$2`",`"`$3`",`$4"
}
I tested this with the data you provided and it adds the quotes to the correct columns.

RegEx required for search-replace using PowerShell

I'm trying to load up a file from a PS script and need to search replace on the basis of given pattern and new values. I need to know what the pattern would be. Here is an excerpt from the file:
USER_IDEN;SYSTEM1;USERNAME1;
30;WINDOWS;Wanner.Siegfried;
63;WINDOWS;Ott.Rudolf;
68;WINDOWS;Waldera.Alicja;
94;WINDOWS;Lanzl.Dieter;
98;WINDOWS;Hofmeier.Erhard;
ReplacerValue: "@dummy.domain.com"
What to be replaced: USERNAME1 column
Expected result:
USER_IDEN;SYSTEM1;USERNAME1;
30;WINDOWS;Wanner.Siegfried@dummy.domain.com;
63;WINDOWS;Ott.Rudolf@dummy.domain.com;
68;WINDOWS;Waldera.Alicja@dummy.domain.com;
94;WINDOWS;Lanzl.Dieter@dummy.domain.com;
98;WINDOWS;Hofmeier.Erhard@dummy.domain.com;
Also, the file can be like this as well:
USER_IDEN;SYSTEM1;USERNAME1;SYSTEM2;USERNAME2;SYSTEM3;USERNAME3;
30;WINDOWS;Wanner.Siegfried;WINDOWS2;Wanner.Siegfried;LINUX;Dev-1;LINUX2;QA1
63;WINDOWS;Ott.Rudolf;WINDOWS2;Ott.Rudolf;LINUX;Dev-2
68;WINDOWS;Waldera.Alicja;
94;WINDOWS;Lanzl.Dieter;WINDOWS4;Lanzl.Dieter;WINDOWS3;Lead1
98;WINDOWS;Hofmeier.Erhard;
In the above examples, I want to target the values in the USERNAMEn columns. The header row may not be present, but the semicolon-separated layout and the system/username pairs stay the same, and the first value is the identifier, so it is always there.
I have found the way to start but need to get the pattern:
(Get-Content C:\script\test.txt) |
Foreach-Object {$_ -replace "^([0-9]+;WINDOWS;[^;]+);$", '$1@dummy.domain.com;'} |
Set-Content C:\script\test.txt
Edit
I came up with this pattern: ^([0-9]+;WINDOWS;[^;]+);$
It is very much fixed to this particular file only with no more than one Domain-Username pair and doesn't depend on the columns.
I think that using a regex to do this is going about it the hard way. Instead of using Get-Content, use Import-Csv, which will split your columns for you. You can then use Get-Member to identify the USERNAME columns. Something like this:
$x = Import-Csv YourFile.csv -Delimiter ';'
$f = @($x[0] | Get-Member -MemberType NoteProperty | Select-Object -ExpandProperty Name | ? {$_ -match 'USERNAME'})
$f | % {
    $n = $_
    $x | % {$_."$n" = $_."$n" + '@dummy.domain.com'}
}
$x | Export-Csv .\YourFile.csv -Delimiter ';' -NoTypeInformation

Using powershell, in a csv doc, need to iterate and insert a character

So my csv file looks something like:
J|T|W
J|T|W
J|T|W
I'd like to iterate through it, most likely using a regex that finds the two pipes and their content (something like \|.+{2}), and insert a tab character (`t) after them.
I'm assuming I'd use get-content to loop through, but I'm unsure of where to go from there.
Also...just thought of this, it is possible that the line will overrun to the next line, and therefore the two pipes will be on different lines, which I'm pretty sure makes a difference.
-Thanks
Ok, I'll move the comment discussion to an answer since it seems like it is a potentially valid solution:
Import-csv .\test.csv -Delimiter '|' -Header 'One', 'two', 'three' | %{$_.Three = "`t$($_.Three)"; $_} | Export-CSV .\test_result.cs
This works for a file that is known to have 3 fields. For a more generic solution, if you have the ability to determine the number of fields initially being exported to CSV, then:
Import-csv .\test.csv -Delimiter '|' -Header (1..$fieldCount) | %{$_.$fieldCount = "`t$($_.$fieldCount)"; $_} | Export-CSV .\test_result.cs
In PowerShell you can use the -replace operator with a regex e.g.:
$c = Get-Content foo.csv | Foreach {$_ -replace '<regex_here>','new_string'}
$c | Out-File foo.csv -encoding ascii
Note that in new_string you can refer to capture groups using $1 but you'll want to put that string in single quotes so PowerShell won't try to interpret $1 as a variable reference.
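The quoting point can be seen on one of the sample lines. This is a sketch only: the pattern is an assumption about what "after the two pipes" means (the first two fields plus their pipes), and the double-quoted variant assumes no variable named $1 is defined and strict mode is off.

```powershell
$line = 'J|T|W'

# Assumed pattern: first field, pipe, second field, pipe, all in capture group 1
$pattern = '^([^|]*\|[^|]*\|)'

# '$1' is single-quoted, so the regex engine sees it as "capture group 1";
# the tab is appended from a double-quoted string where `t is expanded.
$tabbed = $line -replace $pattern, ('$1' + "`t")

# In double quotes PowerShell expands $1 as a (here undefined) variable,
# so the captured text is lost and only the tab is inserted.
$broken = $line -replace $pattern, "$1`t"

$tabbed
$broken
```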

powershell get-content ignore newline

When you use Set-Content, e.g. Set-Content C:\test.txt "test","test1", the two provided strings are separated by a newline by default, but there is also a newline at the end of the file.
How do you ignore this newline or newline with spaces when using Get-Content?
You can remove empty line like this:
Set-Content C:\test.txt "test",'',"test1"
Get-Content c:\test.txt | ? { $_ }
However, it will remove the string in the middle as well.
Edit: Actually as I tried the example, I noticed that Get-Content ignores the last empty line added by Set-Content.
I think your problem is in Set-Content. If you use workaround with WriteAllText, it will work fine:
[io.file]::WriteAllText('c:\test.txt', ("test",'',"test1" -join "`n"))
You pass a string as second parameter. That's why I first joined the strings via -join and then passed it to the method.
Note: this is not recommended for large files, because of the string concatenation that is not efficient.
Get-Content C:\test.txt | Where-Object {$_ -match '\S'}
It is the default behavior for Set-Content to add a newline, because that lets you pass an array of strings and get them written one per line. In any case, Get-Content ignores the last newline (if there are no spaces after it).
Workaround for Set-Content:
([byte[]][char[]] "test"), ([byte]13), ([byte]10) ,([byte[]][char[]] "test1") |
Set-Content c:\test.txt -Encoding Byte
Or use the much simpler [io.file]::WriteAllText.
Can you specify the exact situation (or code)?
For example, if you want to ignore the last line when getting content, it would look like:
$content = Get-Content c:\test.txt
$length = ($content | measure).Count
$content = $content | Select-Object -first ($length - 1)
but if you just do:
"test","test1" | Set-Content C:\test.txt
$content = Get-Content C:\test.txt
$content variable contains two items: "test","test1"
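The trailing newline only becomes visible when the file is read as a single string. A small sketch, assuming a scratch file in the temp folder (the file name is made up) and the -Raw switch of Get-Content:

```powershell
# Hypothetical scratch file
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'newline-sketch.txt'
"test", "test1" | Set-Content $path

# Line-by-line reading: two items, no trace of the final newline
$lines = Get-Content $path

# Reading as one string: the newline written by Set-Content is still there
$raw = Get-Content $path -Raw

# Strip only the final line break (CRLF or LF) if a single string is needed
$trimmed = $raw -replace '\r?\n\z', ''

$lines.Count
$raw.EndsWith("`n")
$trimmed.EndsWith('test1')

# Clean up the scratch file
Remove-Item $path
```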