I need to edit txt file using PowerShell. The problem is that I need to apply changes for the string only if the remaining part of the string matches some pattern. For example, I need to change 'specific_text' to 'other_text' only if the line ends with 'pattern':
'specific_text and pattern' -> changes to 'other_text and pattern'
But if the line doesn't end with pattern, I don't need to change it:
'specific_text and something else' -> no changes
I know about Replace function in PowerShell, but as far as I know it makes simple change for all matches of the regex. There is also Select-String function, but I couldn't combine them properly. My idea was to make it this way:
((get-content myfile.txt | select-string -pattern "pattern") -Replace "specific_text", "other_text") | Out-File myfile.txt
But this call rewrites the whole file and leaves only changed lines.
You may use
(get-content myfile.txt) -replace 'specific_text(?=.*pattern$)', "other_text" | Out-File myfile.txt
The specific_text(?=.*pattern$) pattern matches
specific_text - some specific_text...
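(?=.*pattern$) - a positive lookahead: the match must be immediately followed by any 0 or more chars other than a newline, as many as possible, and then pattern at the end of the string ($). The lookahead only checks; it does not consume text, so only specific_text is replaced.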
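As a quick way to check the lookahead outside PowerShell, here is a minimal sketch in Python, whose re module handles this particular pattern the same way for single-line input. The two sample lines come from the question; the helper name replace_if_suffix is made up for illustration.

```python
import re

# Replace 'specific_text' only when the same line ends with 'pattern'.
# The lookahead (?=.*pattern$) inspects the rest of the line without consuming it.
RX = re.compile(r'specific_text(?=.*pattern$)')

def replace_if_suffix(line):
    # Hypothetical helper name, not part of any library.
    return RX.sub('other_text', line)

lines = ['specific_text and pattern', 'specific_text and something else']
result = [replace_if_suffix(line) for line in lines]
print(result)
```

Lines that do not end with pattern pass through unchanged, so every line can be written back and the file keeps its full content.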
I'm attempting to match events where the only way to tell when an event starts and ends is with the header or first value in the multi line event (e.g. START--). Basically, using the header as an ending anchor to get the whole event. Also, the last event will end at the end of the file, so there's no anchor for that one. I'm not quite sure how to make this work.
Event example (there are no blank lines between the lines):
START--random stuff here
more random stuff on this new line
more stuff and things
START--some random things
additional random things
blah blah
START--data data more data
START--things
blah data
$FileContent | select-string '^START--(.*?)^START--' -AllMatches | Foreach {$_.Matches} | Foreach {$_.Value}
You may read the file into a single variable (for example, by passing the -Raw option to Get-Content) and split it at the start of every line that begins with START--, except the first one:
$contents = Get-Content 'your_file_path' -Raw
$contents -split '(?m)^(?!\A)(?=START--)'
It will yield one string per event, each beginning with START--.
Regex details
(?m) - the multiline option is ON
^ - now, it matches start of lines due to (?m)
(?!\A) - not the start of the whole string/text
(?=START--) - the location that is immediately followed with START-- substring.
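The same split can be tried in Python as a rough cross-check of the regex. This is only a sketch: it assumes Python 3.7 or later (earlier versions do not split on empty matches) and uses the sample events from the question as an inline string.

```python
import re

text = (
    "START--random stuff here\n"
    "more random stuff on this new line\n"
    "more stuff and things\n"
    "START--some random things\n"
    "additional random things\n"
    "blah blah\n"
    "START--data data more data\n"
    "START--things\n"
    "blah data\n"
)

# Split at every line start that is followed by START--,
# except at the very beginning of the text (?!\A).
events = re.split(r'(?m)^(?!\A)(?=START--)', text)

for event in events:
    print(repr(event))
```

Because the split points are zero-width, no characters are lost: joining the pieces gives back the original text.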
I have been trying to extract one field (the hostname) from those lines of a .txt file in which another field equals 40, using PowerShell.
I have code so far :
$file = Get-Content 'c:\temp\file.txt'
$Array = @()
foreach ($line in $file)
{
$Array += $line.split(",")[6]
}
$Array
$Array | sc "c:\temp\export2.txt"
Txt file : (may be duplicate lines such as hostname01)
4626898,0,3,0,POL,INCR,hostname01,xx,1549429809,0000000507,1549430316,xxx,0,40,1,xxxx,51870834,5040,100
4626898,0,3,0,POL,INCR,hostname02,xx,1549429809,0000000507,1549430316,xxx,0,15,1,xxxx,51870834,5040,100
4626898,0,3,0,POL,INCR,hostname03 developer host,xx,1549429809,0000000507,1549430316,xxx,0,40,1,xxxx,51870834,5040,100
4626898,0,3,0,POL,INCR,hostname01,xx,1549429809,0000000507,1549430316,xxx,0,40,1,xxxx,51870834,5040,100
This is what I want :
hostname01
hostname02
hostname03 developer host
This is not a fast solution, but a convenient and flexible one:
Since your text file is effectively a CSV file, you can use Import-Csv.
Your data is missing a header row (column names), which we can supply to Import-Csv via its -Header parameter.
Since you're interested in columns number 7 (hostnames) and 14 (the number whose value should be 40), we need to supply column names (of our choice) for columns 1 through 14.
Import-Csv conveniently converts the CSV rows to (custom) objects, whose properties you can query with Where-Object and selectively extract with Select-Object; adding -Unique suppresses duplicate values.
To put it all together:
Import-Csv c:\temp\file.txt -Header (1..14) |
Where-Object 14 -eq 40 |
Select-Object -ExpandProperty 7 -Unique
For convenience we've named the columns 1, 2, ... using a range expression (1..14), but you're free to use descriptive names.
Assuming that c:\temp\file.txt contains your sample data, the above yields:
hostname01
hostname03 developer host
To output to a file, pipe the above to Set-Content, as in your question:
... | Set-Content c:\temp\export2.txt
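For comparison, the same filter-then-deduplicate logic can be sketched with Python's csv module. This is an illustration under assumptions, not part of the PowerShell solution: the sample rows are pasted inline instead of read from c:\temp\file.txt, and the columns are addressed by zero-based index (6 for the hostname, 13 for the value that should be 40).

```python
import csv
import io

sample = """\
4626898,0,3,0,POL,INCR,hostname01,xx,1549429809,0000000507,1549430316,xxx,0,40,1,xxxx,51870834,5040,100
4626898,0,3,0,POL,INCR,hostname02,xx,1549429809,0000000507,1549430316,xxx,0,15,1,xxxx,51870834,5040,100
4626898,0,3,0,POL,INCR,hostname03 developer host,xx,1549429809,0000000507,1549430316,xxx,0,40,1,xxxx,51870834,5040,100
4626898,0,3,0,POL,INCR,hostname01,xx,1549429809,0000000507,1549430316,xxx,0,40,1,xxxx,51870834,5040,100
"""

hosts = []
for row in csv.reader(io.StringIO(sample)):
    # Column 14 (index 13) must equal 40; column 7 (index 6) is the hostname.
    if row[13] == '40' and row[6] not in hosts:
        hosts.append(row[6])  # keep first occurrence only, preserving order

print('\n'.join(hosts))
```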
If the desired field is always the 7th in the line (index 6, since indexing is zero-based), it is easier to split each line and fetch the member at index 6:
Get-Content 'c:\temp\file.txt' | Foreach-Object {($_ -split ',')[6]} | Select-Object -Unique
You could use a non-capturing group to check that the start of the line has the expected format, and reference the seventh field through the first capture group, $1:
(?:\d+,\d,\d,\d,[A-Z]+,[A-Z]+,)([a-zA-Z 0-9]+)
(?: ) - Specifies a non-capturing group (meaning it isn't referenced via $1 or $2 like a normal capture group).
\d+, (I won't repeat all of these, but) looks for one or more digits followed by a literal ,.
[A-Z]+, - Finds an all capital letter string, followed by a literal , (this occurs twice).
([a-zA-Z 0-9]+) - The capture group you're looking for, $1, that will capture all characters a-z, A-Z, spaces, and digits up until a character not in this set (in this case, a comma). Giving you the text you're looking for.
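A small sketch of how the capture group behaves, written in Python purely to exercise the regex (the pattern is unchanged and the input is one of the sample lines from the question):

```python
import re

# Non-capturing prefix for the first six fields, then one capture group
# for the seventh field (letters, digits and spaces only).
rx = re.compile(r'(?:\d+,\d,\d,\d,[A-Z]+,[A-Z]+,)([a-zA-Z 0-9]+)')

line = ('4626898,0,3,0,POL,INCR,hostname03 developer host,xx,1549429809,'
        '0000000507,1549430316,xxx,0,40,1,xxxx,51870834,5040,100')

m = rx.match(line)
host = m.group(1) if m else None  # group(1) corresponds to $1
print(host)
```

Note that this pattern only validates the line prefix; it does not check that the fourteenth field is 40.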
Below should work with what you are trying to do
Get-Content 'c:\temp\file.txt' | %{
$_.Split(',')[6]
}| select -Unique
I have several files in a folder, those are .xml files.
I want to get a value from those files.
A line in the file, could look like this:
<drives name="Virtual HD ATA Device" deviceid="\\.\PHYSICALDRIVE0" interface="IDE" totaldisksize="49,99">
What I'm trying to do is get the value 49,99 in this case.
I am able to get the line out of the file with:
$Strings = Select-String -Path "XML\*.xml" -Pattern totaldisksize
foreach ($String in $Strings) {
Write-Host "Line is" $String
}
But I don't see how to get just the value inside the quotes. I've also played around with
$Strings.totaldisksize
But no dice.
Thanks in advance.
You can do this in one line as follows:
$(select-string totaldisksize .\XML\*.xml).line -replace '.*totaldisksize="(\d+,\d+)".*','$1'
The Select-String will give you a collection of objects that contains information about the match. The line property is the one you're interested in, so you can pull that directly.
Using the -replace operator, every time the .line property is a match of totaldisksize, you can run the regex on it. The $1 replacement will grab the group in the regex, the group being the part in parentheses (\d+,\d+) which will match one or more digits, followed by a comma, followed by one or more digits.
This will print to screen because by default powershell will print an object to the screen. Because you're only accessing the .line property, that's the only bit that's printed and also only after the replacement has been run.
If you wanted to explicitly use a Write-Host to see the results, or do anything else with them, you could store to a variable as follows:
$sizes = $(select-string totaldisksize .\XML\*.xml).line -replace '.*totaldisksize="(\d+,\d+)".*','$1'
$sizes | % { Write-Host $_ }
The above stores the results to an array, $sizes, and you iterate over it by piping it to the Foreach-Object or %. You can then access the array elements with $_ inside the block.
But.. but.. PowerShell knows XML.
$XMLfile = '<drives name="Virtual HD ATA Device" deviceid="\\.\PHYSICALDRIVE0" interface="IDE" totaldisksize="49,99"></drives>'
$XMLobject = [xml]$XMLfile
$XMLobject.drives.totaldisksize
Output
49,99
Or walk the tree and return the content of "drives":
$XMLfile = @"
<some>
<nested>
<tags>
<drives someOther="stuff" totaldisksize="49,99" freespace="22,33">
</drives>
</tags>
</nested>
</some>
"@
$drives = [xml]$XMLfile | Select-Xml -XPath "//drives" | select -ExpandProperty node
Output
PS> $drives
someOther totaldisksize freespace
--------- ------------- ---------
stuff 49,99 22,33
PS> $drives.freespace
22,33
XPath query of "//drives" = Find all nodes named "drives" anywhere in the XML tree.
Reference: Windows PowerShell Cookbook 3rd Edition (Lee Holmes). Page 930.
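The parse-then-query idea is not specific to PowerShell. As a hedged illustration, the same lookup can be sketched with Python's standard xml.etree.ElementTree module, which supports a limited XPath subset; the XML below is the nested sample from this answer.

```python
import xml.etree.ElementTree as ET

xml_text = """<some>
  <nested>
    <tags>
      <drives someOther="stuff" totaldisksize="49,99" freespace="22,33">
      </drives>
    </tags>
  </nested>
</some>"""

root = ET.fromstring(xml_text)

# './/drives' selects drives elements at any depth below the root,
# comparable to the //drives query used with Select-Xml.
sizes = [d.get('totaldisksize') for d in root.findall('.//drives')]
print(sizes)
```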
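I am not sure about PowerShell, but if you prefer Python, this is one way of doing it:
import re

# 'file' is a placeholder for the path to one of the XML files
with open('file') as f:
    data = f.read()
# Capture the digits and commas inside totaldisksize="..."
item = re.findall(r'totaldisksize="([\d,]+)"', data)
print(item[0])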
Output
49,99
I have a large text file containing filenames ending in .txt
Some of the rows of the file have unwanted text after the filename extension.
I am trying to find a way to search+replace or trim the whole file so that if a row is found with .txt, anything after this is simply removed. Example
C:\Test1.txt
C:\Test2.txtHelloWorld this is my
problem
C:\Test3.txt_____Annoying
stuff1234 .r
Desired result
C:\Test1.txt
C:\Test2.txt
C:\Test3.txt
I have tried with notepad++, or using batch/powershell, but got close, no cigar.
(Get-Content "D:\checkthese.txt") |
Foreach-Object {$_ -replace '.txt*', ".txt"} |
Set-Content "D:\CLEAN.txt"
My thinking here is that if I replace anything (wildcard *) after .txt, I would trim off what I need, but this doesn't work. I think I need to use a regular expression, but I have the syntax wrong.
Simply change the * to a .*, like so:
(Get-Content "D:\checkthese.txt") |
Foreach-Object {$_ -replace '\.txt.*', ".txt"} |
Set-Content "D:\CLEAN.txt"
In regular expressions, * means "0 or more times", and in this case it'd act on the final t of .txt, so .txt* would only match .tx, .txt, .txtt, .txttt, etc...
., however, matches any character. This means, .* matches 0 or more of anything, which is what you want. Because of this, I also escaped the . in .txt, as it otherwise could break on filenames like: alovelytxtfile.txt, which would be trimmed to alovel.txt.
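To see the difference between the escaped and unescaped dot, here is a minimal sketch in Python; the regex behaves the same way for this case. The rows are taken from the question's example, with the wrapped continuation lines left out.

```python
import re

rows = [
    r'C:\Test1.txt',
    r'C:\Test2.txtHelloWorld this is my',
    r'C:\Test3.txt_____Annoying',
]

# Escaped dot: match a literal ".txt" and everything after it on the line.
cleaned = [re.sub(r'\.txt.*', '.txt', row) for row in rows]
print(cleaned)

# Unescaped dot: "." matches any character, so "ytxt" matches first
# and the file name is cut too early.
escaped = re.sub(r'\.txt.*', '.txt', 'alovelytxtfile.txt')
unescaped = re.sub(r'.txt.*', '.txt', 'alovelytxtfile.txt')
print(escaped, unescaped)
```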
For more information, see:
Regex Tutorial - .
Regex Tutorial - *