I am trying to build a PowerShell script that takes an input file and a regex and replaces the matching content with the value of the corresponding environment variable.
For example,
If the input file contains the following:
<?xml version="1.0" encoding="utf-8"?>
<Application xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Name="fabric:/Services" xmlns="http://schemas.microsoft.com/2011/01/fabric">
<Parameters>
<Parameter Name="IntegrationManager_PartitionCount" Value="1" />
<Parameter Name="IntegrationManager_MinReplicaSetSize" Value="2" />
<Parameter Name="IntegrationManager_TargetReplicaSetSize" Value="#{INT_MGR_IC}" />
<Parameter Name="EventManager_InstanceCount" Value="#{EVT_MGR_IC}" />
<Parameter Name="Entities_InstanceCount" Value="#{ENT_IC}" />
<Parameter Name="Profile_InstanceCount" Value="#{PRF_IC}" />
<Parameter Name="Identity_InstanceCount" Value="#{IDNT_IC}" />
</Parameters>
</Application>
I would like to build a script that replaces #{INT_MGR_IC} with the value of the INT_MGR_IC environment variable and so on.
If you know of such a script or can point me in the right direction, it would be a great help. Specifically, I would like to know how to:
Extract and loop over keys from the file, such as #{INT_MGR_IC}, #{EVT_MGR_IC}, etc.
Replace each key with its associated environment variable once I have it. For example, #{INT_MGR_IC} with the INT_MGR_IC environment variable.
Thanks a lot for looking into this :)
UPDATE 1
This is the RegEx I am planning to use: /#{(.+)}/g
Just load the file using the Get-Content cmdlet, iterate over each Parameter, filter the parameters whose Value starts with a # using Where-Object, and change the value. Finally, use the Set-Content cmdlet to write it back:
$contentPath = 'Your_Path_Here'
$content = [xml] (Get-Content $contentPath)
$content.DocumentElement.Parameters.Parameter | Where Value -Match '^#' | ForEach-Object {
$_.Value = "REPLACE HERE"
}
$content | Set-Content $contentPath
In case you need to look up an environment variable, you could use [Environment]::GetEnvironmentVariable($name), where $name stands for $_.Value with its #{ and } delimiters stripped.
Thanks a lot to everyone who helped. Especially @jisaak :)
Here is the final script I built that solved the problem in the question. Hopefully it's useful to someone!
$configPath = 'Cloud.xml'
$config = [xml] (Get-Content $configPath)
$config.DocumentElement.Parameters.Parameter | Where {$_.Value.StartsWith("#{")} | ForEach-Object {
$var = $_.Value.replace("#{","").replace("}","")
$val = (get-item env:$var).Value
Write-Host "Replacing $var with $val"
$_.Value = $val
}
$config.Save($configPath)
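For readers outside PowerShell, the same idea can be sketched as a small shell function. This is only a sketch under assumptions: the function name expand_placeholders is made up for illustration, placeholder names are assumed to look like shell variable names, and a placeholder whose variable is unset is replaced with an empty string. Unlike the script above, it treats the file as plain text rather than parsing the XML.

```shell
# Replace every #{NAME} placeholder with the value of environment variable NAME.
# expand_placeholders is a hypothetical helper name, not an existing tool.
expand_placeholders() {
  awk '{
    out = ""
    rest = $0
    # Consume the line left to right so substituted values are never rescanned.
    while (match(rest, /#[{][A-Za-z_][A-Za-z0-9_]*[}]/)) {
      name = substr(rest, RSTART + 2, RLENGTH - 3)   # text between "#{" and "}"
      out = out substr(rest, 1, RSTART - 1) ENVIRON[name]
      rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
  }' "$@"
}
```

For example, after export INT_MGR_IC=3, running expand_placeholders Cloud.xml > Cloud.expanded.xml writes a copy with the values filled in. Each placeholder is matched on its own, so two placeholders on one line are not merged the way a greedy .+ would merge them.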
My XML input file looks like this:
...
<logos>
<logo name="" primary="true" guid="c6aae8fe-bb04-4067-9b14-18b1bcf940d3" />
<logo name="" primary="false" guid="68b55f4d-f401-4180-b0e0-160974758348" />
</logos>
...
I need to remove the content, keeping the node. Expected output:
<logos></logos>
My command looks like this:
sed -i 's|\(<logos>\)\(.+\)\(</logos>\)|\1\3|gi' $filename
But it ain't working. What am I missing?
Edit: this is not a duplicate of delete node in a xml file with sed : that question is about deleting the whole node. Here I need to delete the content of the node only.
You could use address ranges in addition to c command:
sed -i.bak '/<logos>/,/<\/logos>/c<logos></logos>' $filename
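As for why the original attempt fails: sed works on one line at a time, and in basic regular expressions + is an ordinary character, so \(.+\) never spans the lines between the two tags. The one-line form of the c command used above is a GNU sed convenience. Where it is not available, a roughly equivalent awk sketch is shown below. The function name empty_logos is hypothetical, it assumes the logos elements are not nested, and the original indentation of the <logos> line is not preserved.

```shell
# Print the input with everything from <logos> through </logos> collapsed
# to a single empty element. empty_logos is a made-up helper name.
empty_logos() {
  awk '
    /<logos>/ {
      print "<logos></logos>"
      skip = ($0 !~ /<\/logos>/)   # keep skipping unless the close is on this line
      next
    }
    skip && /<\/logos>/ { skip = 0; next }   # closing tag ends the skipped range
    !skip                                    # print every line outside the range
  ' "$@"
}
```

Usage would be along the lines of empty_logos "$filename" > "$filename.new", followed by a check of the result before replacing the original.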
sed and similar tools are a bad choice for such cases.
Use a proper XML/HTML parser.
xmlstarlet solution:
Sample input.xml:
<root>
<logos>
<logo name="" primary="true" guid="c6aae8fe-bb04-4067-9b14-18b1bcf940d3"/>
<logo name="" primary="false" guid="68b55f4d-f401-4180-b0e0-160974758348"/>
</logos>
</root>
xmlstarlet ed -O -d '//logos/*' input.xml
The output:
<root>
<logos/>
</root>
I want to get rid of the xml-code from within more than 100 xml-files.
I want to use PowerShell. Here is one sample file:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="../../../helpproject.xsl" ?><topic
template="Default" lasteditedby="liliya" xmlns:xsi="http://www.w3.org
/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../..
/../helpproject.xsd">
<title translate="true">Passwörter verwalten</title>
<body>
<header>
<para styleclass="Heading1"><text styleclass="Heading1"
translate="true">Passwörter verwalten</text></para>
</header>
<para styleclass="Normal"><table styleclass="container" rowcount="3"
colcount="2" style="width:970px;">
<tr style="vertical-align:top">
<td style="width:50%;">
<para styleclass="H1"><text styleclass="H1"
translate="true">Passwörter verwalten</text></para>
</td>
<td style="width:50%;">
<para styleclass="Image"><image src="manage_passwords.PNG"
scale="100.00%" styleclass="Image"><title translate="true">Passwörter
verwalten</title></image></para>
</td>
</tr>
</table></para>
<para styleclass="txt"/>
In Notepad++ after regex of <.+?> and ^\s+ I see just the text!
With this script I copy the originals (to leave them unchanged) to a single folder, and then I just want to eliminate the XML tags:
Get-ChildItem -Path "C:\Users\cas\Documents\Wurzel_XML\" -Recurse |
Where-Object Name -like "*.xml" |
Copy-Item -Destination "C:\Users\cas\Documents\check_xml\"
$newText = ($newText -replace "<.*?>", "").trim()|?{$_ -ne ''}
Get-ChildItem -Path "C:\Users\cas\Documents\check_xml\" |
Set-Content -Value $newText
But after that all the files are completely empty?
I previously tried
$newText = ($newText -replace "(?ms)^\s+<.*?</.*?>", "")
Get-ChildItem -Path "C:\Users\cas\Documents\check_xml\" |
Set-Content -Value $newText
with the same result.
What am I doing wrong with that regex?
Thanks in advance,
Gooly
Do Not Use Regular Expression Processing To Parse HTML, XHTML, or XML
PowerShell has cmdlets that can be used to process XML, and the techniques that can be used with it have been discussed in many places (See this Google search). If you read your files as structured XML files, and then use the Select-XML cmdlet with appropriate XPath queries, you can extract the information you need, reliably - provided that your XML is well-formed in the first place.
My build.xml has 134 targets, most of which are hidden (hidden="true"). Is there a way to list all targets from the commandline? Target-definitions are sometimes split over multiple-lines, and I usually use double-quote-characters for properties. I'm running this on Debian, and kudos for sorting targets and/or also displaying descriptions. :-)
Examples:
<target name="example1" hidden="false" />
<target name="example3" hidden="true" />
<target
description="Ipsum lorem"
hidden="true"
name='example3'
>
<phingcall target="example1" />
</target>
Phing itself has no option for this, but it can be done from within a Phing build file. There's probably a cleaner, better way than this, but this works, assuming all other properties are wrapped in double quotes (i.e. it just passes example#3, above):
<target name="list_all" hidden="false">
<property name="command" value="
cat ${phing.file.foo}
| perl -pe 's|^\s*||g'
| perl -0pe 's|\n([^<])| \1|gs'
| grep '<target'
| perl -pe "s|name='([^']*)'|name=\"\1\"|g"
| perl -pe 's|^<target(\s?)||'
| perl -pe 's|(.*)([ ]?)depends="([^"]*)"([ ]?)(.*)|\1 \2|g'
| perl -pe 's|(.*)([ ]?)hidden="([^"]*)"([ ]?)(.*)|\1 \2|g'
| perl -pe 's|.*description="([^"]*).*name="([^"]*).*|name="\2" description="\1"|g'
| perl -pe 's|name="([^"]*)"|\1|g'
| perl -pe 's|description="([^"]*)"|[\1]|g'
| sort
| uniq
" override="true" />
<exec command="${command}" passthru="true" />
</target>
What do those lines do?
1) output the contents of build.xml. Here, I'm interested in my 'global' build.xml which is named 'foo';
2) remove all leading whitespace from each line;
3) remove line-breaks within each opening tag;
4) filter for lines starting "<target";
5) change single quote-marks on name-property to double;
6, 7, 8) remove leading "<target", and depends and hidden properties;
9) move 'description' after 'name' on each line;
10) remove 'name=' and its quote-marks;
11) replace 'description=' and its quote-marks with square-brackets; and
12, 13) sort & remove duplicates (if any)
Sample output:
example1
example2
example3 [Ipsum lorem]
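If only the target names are needed, a shorter variant of the same idea might look like the sketch below. The function name list_targets is an assumption, descriptions are dropped, and it assumes no attribute value contains a > character. It relies on grep -o, which GNU and BSD grep provide.

```shell
# List the names of all <target> elements in a build file, sorted and unique.
# list_targets is a hypothetical helper; pass the build file as the first argument.
list_targets() {
  tr '\n' ' ' < "$1" |            # join lines so multi-line tags become one line
    grep -o '<target[^>]*>' |     # emit one opening <target ...> tag per line
    sed -e "s/.*name=[\"']\([^\"']*\)[\"'].*/\1/" |   # keep only the name value
    sort -u
}
```

Called as list_targets build.xml, it accepts both single-quoted and double-quoted name attributes and ignores hidden="true" entirely, so hidden targets are listed too.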
You can't with phing itself. The code simply skips the display if the target is set to "hidden".
I have a file named test.txt with the following content
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test time="60" id="01">
<java.lang.String value="cat"/><java.lang.String value="dog"/>
<java.lang.String value="mouse"/>
<java.lang.String value="cow"/>
</test>
What I would like to do is edit the file so that whenever I find something like <java.lang.String value="something"/>, I change that part to <animal>something</animal>.
So for the previous example, after applying a script with a sed/awk/grep command, the file content will be changed to the following (or a new file will be created with it):
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test time="60" id="01">
<animal>cat</animal><animal>dog</animal>
<animal>mouse</animal>
<animal>cow</animal>
</test>
I tried to extract that particular part using following command :
less test.txt | grep -Po 'java.lang.String value="\K[^"]*' | awk -F: '{print "<animal>" $1 "</animal>"}'
The output gives me the changed part, but I want this changed part along with the rest of the file unchanged :
<animal>cat</animal>
<animal>dog</animal>
<animal>mouse</animal>
<animal>cow</animal>
I am new to scripting, and I don't know how to write the complete output to a file.
sed -r 's#<java.lang.String value="([^"]*)"/>#<animal>\1</animal>#g' test.txt
And you should not do XML transformations with regular expressions...
EDIT about how it works
By default sed uses "basic regular expressions", where many special characters have to be prefixed with \. The -r flag switches to "extended regular expressions", where the syntax is less cumbersome. See OpenGroup for details.
By default sed prints every input line, modified by whatever commands apply to it. The substitution command has the form s#search_regexp#replacement#flags. The delimiter can be almost any character, such as /, #, or ,. I chose # so that it doesn't clash with the / character in the XML.
Then we match things like <java.lang.String value="anything_except_quotes"/>. The part that we want to reuse is wrapped in parentheses; it's called a capture group. In the replacement we refer to whatever the group captured as \1.
The g flag makes sed replace all occurrences of the search pattern on a line, not only the first one.
ok some problems with your command:
less test.txt | grep -Po 'java.lang.String value="\K[^"]*' | awk -F: '{print "<animal>" $1 "</animal>"}'
to begin with, there's a useless use of less, grep can take a file as a parameter:
grep -Po 'java.lang.String value="\K[^"]*' test.txt | awk -F: '{print "<animal>" $1 "</animal>"}'
then you're using grep to select lines that match a string, so basically, your sequence of commands is explicitly keeping only the lines that have the java.lang... string and taking everything else out. A simpler solution would be to use sed:
sed -r 's,<java.lang.String value="([^"]*)"\s*/>,<animal>\1</animal>,g' test.txt
which uses the substitution syntax of sed to replace the match, while capturing what's between the parentheses ( and ) as \1 for use in the replacement. The [^"] part matches any character that is not a ", and the * operator applies that match 0 or more times. Likewise, \s* matches 0 or more whitespace characters.
A regex is an automaton that uses states and transitions to match a given string. Here's a visual of how the regex works:
[Image: demo of the regex on an example]
Though in your particular case that simple regex works out, keep in mind that this is only a hack. You should instead use an XML parser and replace the nodes to match your needs, using XSLT/XSLFO that are tools designed to transform an XML into another one (or something else).
To do that, you could use a tool such as xsltproc and look at this Q for an example that transforms all foo nodes into bar nodes in an XML tree, here's how to do it:
test.xsl:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<!--Identity Template. This will copy everything as-is.-->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<!--Change "java.lang.String" element to "animal" element.-->
<xsl:template match="java.lang.String">
<animal>
<!-- get the attribute 'value' of java.lang.String -->
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</animal>
</xsl:template>
</xsl:stylesheet>
run:
xsltproc test.xsl test.xml
result:
<?xml version="1.0"?>
<test time="60" id="01">
<animal value="cat"/>
<animal value="dog"/>
<animal value="mouse"/>
<animal value="cow"/>
</test>
and by the way, given your XML, it looks like it has been generated by Java, and there's multiple ways to apply that XSL from within your code, even before you need to handle it using command line tools.
I am trying to turn a particular XML statement into a comment. I have been trying Linux tools such as awk and sed, or any regular expression, but I am completely stuck. Is there any way to achieve this? Below is the scenario I am looking at.
For example:
I have n XML files. I want to replace every statement that contains the word "Distribution_Facilities_carrying_Item" with a commented-out version of that statement.
Suppose the statement is:
<Parameter name="RelationshipName1" direction="in" eval="constant" type="string">Distribution_Facilities_carrying_Item</Parameter>
As this statement contains the word "Distribution_Facilities_carrying_Item", I want it to be replaced by the following comment:
<!--Parameter name="RelationshipName1" direction="in" eval="constant" type="string">Distribution_Facilities_carrying_Item</Parameter-->
Further, every such statement in all the XML files should be replaced by a commented-out statement. Below are the patterns in which they might occur. How should I go about it? I assume it takes someone adept at regular expressions, because that seems to be the only way to achieve it.
This statement can occur in any number of XML files.
File:a.xml
<Parameter name="RelationshipName1" direction="in" eval="constant" type="string">Distribution_Facilities_carrying_Item</Parameter>
<Parameter direction="in" eval="constant" type="string" name="RelationshipName3">Distribution_Facilities_carrying_Item</Parameter>
<Parameter name="RelationshipName" direction="in" eval="constant" type="string">Distribution_Facilities_carrying_Item</Parameter>
<Parameter direction="in" name="RelationshipName10" type="string" eval="constant">Distribution_Facilities_carrying_Item</Parameter>
<Parameter direction="in" name="RelationshipName11" type="string" eval="constant">Distribution_Facilities_carrying_Item</Parameter>
<Parameter direction="in" eval="constant" type="string" name="RelationshipName5">Distribution_Facilities_carrying_Item</Parameter>
Thanks in advance!!
Using sed:
sed '/Distribution_Facilities_carrying_Item/ s/<\(.*\)>/<!--\1-->/' inputfile
would comment all lines containing the string Distribution_Facilities_carrying_Item.
If you want to modify the file in-place, add the -i option:
sed -i '/Distribution_Facilities_carrying_Item/ s/<\(.*\)>/<!--\1-->/' inputfile
If this is to be performed for all .xml files in a directory, use find and -exec:
find /some/dir -maxdepth 1 -type f -name "*.xml" -exec sed -i '/Distribution_Facilities_carrying_Item/ s/<\(.*\)>/<!--\1-->/' {} \;
(Remove -maxdepth 1 from the find command if you want to do it recursively.)
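One caveat with the commands above: running them a second time wraps the already commented lines again. A sketch that skips lines already containing <!-- is shown below. The function name comment_param is hypothetical, and it assumes each Parameter element sits on a single line, as in the sample.

```shell
# Comment out lines containing the search word, unless the line
# already contains a comment opener. comment_param is a made-up helper name.
comment_param() {
  sed '/<!--/!{/Distribution_Facilities_carrying_Item/s/<\(.*\)>/<!--\1-->/;}' "$@"
}
```

Because the function only prints to standard output, comment_param a.xml | diff a.xml - gives a preview of the changes before any file is modified in place.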
Try the sed expression below; it will comment out the matching line:
sed -i 's/\(<.*Distribution_Facilities_carrying_Item.*>\)/<!--\1-->/' filename.xml
Do not use regular expressions to parse XML. Use a proper parser. For example, using xsh:
my $search = "Distribution_Facilities_carrying_Item" ;
for my $file in { @ARGV } {
open $file ;
for my $p in //Parameter[text() = $search]
xinsert comment { $p->toString } replace $p ;
save :b ;
}
If you want to delete the text, too, you can change the inner loop to
for my $p in //Parameter[text() = $search] {
delete $p/text() ;
xinsert comment { $p->toString } replace $p ;
}
An awk version:
awk '/Distribution_Facilities_carrying_Item/ {sub(/^</,"<!--");sub(/>$/,"-->")}1' a.xml