Replacing particular occurrence of a string with comment - regex

I am trying to replace a particular XML statement with a commented-out version of itself. I have been looking for a Linux awk, sed, or other regular-expression solution, but I am completely stuck. Is there any way to achieve this? Below is the scenario I am looking for.
For example:
I have n XML files. I want every statement that contains the word "Distribution_Facilities_carrying_Item" to be replaced with a commented-out statement.
Suppose the statement is:
<Parameter name="RelationshipName1" direction="in" eval="constant" type="string">Distribution_Facilities_carrying_Item</Parameter>
As this statement contains the word "Distribution_Facilities_carrying_Item", I want to turn it into a comment, so it should be replaced with:
<!--Parameter name="RelationshipName1" direction="in" eval="constant" type="string">Distribution_Facilities_carrying_Item</Parameter-->
Furthermore, every such statement in all the XML files should be replaced with its commented-out form. Below is the pattern in which they might occur. How should I go about it? I know one needs to be adept at regular expressions, because that seems to be the only way to achieve this.
This statement can appear in n XML files.
File:a.xml
<Parameter name="RelationshipName1" direction="in" eval="constant" type="string">Distribution_Facilities_carrying_Item</Parameter>
<Parameter direction="in" eval="constant" type="string" name="RelationshipName3">Distribution_Facilities_carrying_Item</Parameter>
<Parameter name="RelationshipName" direction="in" eval="constant" type="string">Distribution_Facilities_carrying_Item</Parameter>
<Parameter direction="in" name="RelationshipName10" type="string" eval="constant">Distribution_Facilities_carrying_Item</Parameter>
<Parameter direction="in" name="RelationshipName11" type="string" eval="constant">Distribution_Facilities_carrying_Item</Parameter>
<Parameter direction="in" eval="constant" type="string" name="RelationshipName5">Distribution_Facilities_carrying_Item</Parameter>
Thanks in advance!!

Using sed:
sed '/Distribution_Facilities_carrying_Item/ s/<\(.*\)>/<!--\1-->/' inputfile
would comment out all lines containing the string Distribution_Facilities_carrying_Item.
If you want to modify the file in-place, add the -i option:
sed -i '/Distribution_Facilities_carrying_Item/ s/<\(.*\)>/<!--\1-->/' inputfile
If this is to be performed for all .xml files in a directory, use find and -exec:
find /some/dir -maxdepth 1 -type f -name "*.xml" -exec sed -i '/Distribution_Facilities_carrying_Item/ s/<\(.*\)>/<!--\1-->/' {} \;
(Remove -maxdepth 1 from the find command if you want to do it recursively.)
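For anyone who would rather script this than run sed, the same line-based substitution can be sketched in Python. This is only an illustration under the same assumption as the sed command (each statement sits on its own line); the function name comment_matching_lines and the shortened sample lines are made up for this example.

```python
import re

NEEDLE = "Distribution_Facilities_carrying_Item"

def comment_matching_lines(text, needle=NEEDLE):
    """Mimic the sed command: on lines containing `needle`, turn the span
    from the first '<' to the last '>' into <!-- ... -->."""
    out = []
    for line in text.splitlines(keepends=True):
        if needle in line:
            # Greedy .* runs from the first '<' to the last '>' on the line.
            line = re.sub(r"<(.*)>", r"<!--\1-->", line, count=1)
        out.append(line)
    return "".join(out)

# Hypothetical two-line sample: only the first line contains the needle.
sample = (
    '<Parameter name="RelationshipName1" type="string">'
    "Distribution_Facilities_carrying_Item</Parameter>\n"
    '<Parameter name="Other" type="string">Something_Else</Parameter>\n'
)
converted = comment_matching_lines(sample)
print(converted, end="")
```

Like the sed version, this breaks down if a statement spans several lines or if the line already contains a comment.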

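The following sed command also comments out the matching lines. It wraps the whole statement, so the result has the form <!--<Parameter ...>...</Parameter>-->: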
sed -i 's/\(<.*Distribution_Facilities_carrying_Item.*>\)/<!--\1-->/' filename.xml

Do not use regular expressions to parse XML. Use a proper parser. For example, using xsh:
my $search = "Distribution_Facilities_carrying_Item" ;
for my $file in { @ARGV } {
open $file ;
for my $p in //Parameter[text() = $search]
xinsert comment { $p->toString } replace $p ;
save :b ;
}
If you want to delete the text, too, you can change the inner loop to
for my $p in //Parameter[text() = $search] {
delete $p/text() ;
xinsert comment { $p->toString } replace $p ;
}
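If xsh is not available, the same parser-based idea can be sketched with Python's standard xml.etree.ElementTree. This is an assumption-laden illustration rather than a drop-in tool: the function name comment_out and the tiny sample document are invented here, and it only matches Parameter elements without a namespace.

```python
import xml.etree.ElementTree as ET

SEARCH = "Distribution_Facilities_carrying_Item"

def comment_out(root, search=SEARCH):
    """Replace each <Parameter> whose text equals `search` with a comment
    holding the serialized element. Returns the number of replacements."""
    count = 0
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "Parameter" and (child.text or "").strip() == search:
                tail = child.tail
                child.tail = None  # keep trailing whitespace out of the comment
                comment = ET.Comment(ET.tostring(child, encoding="unicode"))
                comment.tail = tail
                parent[index] = comment  # swap the element for the comment in place
                count += 1
    return count

# Hypothetical sample document with one matching and one non-matching element.
doc = ET.fromstring(
    '<root><Parameter name="RelationshipName1">'
    "Distribution_Facilities_carrying_Item</Parameter>"
    '<Parameter name="Other">Keep_Me</Parameter></root>'
)
changed = comment_out(doc)
result = ET.tostring(doc, encoding="unicode")
print(result)
```

Note that ElementTree discards comments already present in a parsed file unless the parser is built with a TreeBuilder that has insert_comments=True, so check that before overwriting real files.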

An awk version:
awk '/Distribution_Facilities_carrying_Item/ {sub(/^</,"<!--");sub(/>$/,"-->")}1' a.xml

Related

How to find properties containing matching certain pattern using xmllint

I am trying to extract a value in a shell script using xmllint, I was able to find and extract values by matching complete key strings.
The problem is for some values I just know what the key starts with.
For example: let a part of xml be:
<property>
<name>foo.bar.random_part_of_name</name>
<value> SOME_VALUE</value>
</property>
I want to extract this entire segment and write it to an output file.
So far, I have been able to match complete segments with
if (xmllint --xpath '//property[name/text()="foo.bar"]/value/text()' "$INPUT_FILE"); then
value=$(xmllint --xpath '//property[name/text()="foo.bar"]/value/text()' "$INPUT_FILE")
echo "<property><name>foo.bar</name><value>$value</value></property>">> $OUTPUT_FILE
fi
Thanks in advance
XPath 1.0 offers the starts-with(string, prefix) function, which does what you want:
name="foo.bar"
value=$(xmllint --xpath "//property[starts-with(name,'$name')]/value/text()" test.xml)
if [ -n "$value" ]; then
echo "<property><name>$name</name><value>$value</value></property>"
fi
Result:
<property><name>foo.bar</name><value> SOME_VALUE</value></property>
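If the whole matching <property> segment is wanted, including the real name rather than the prefix, a rough Python equivalent is sketched below. ElementTree's limited XPath has no starts-with(), so the prefix test is done in Python instead. The function name properties_with_prefix, the <configuration> wrapper element and the second property are assumptions for the example.

```python
import xml.etree.ElementTree as ET

def properties_with_prefix(root, prefix):
    """Return the serialized <property> elements whose <name> starts with prefix."""
    matches = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith(prefix):
            # strip() drops the whitespace that follows the element in the source
            matches.append(ET.tostring(prop, encoding="unicode").strip())
    return matches

# Hypothetical input; the wrapper element name is an assumption.
root = ET.fromstring(
    "<configuration>"
    "<property><name>foo.bar.random_part_of_name</name><value> SOME_VALUE</value></property>"
    "<property><name>other.key</name><value>X</value></property>"
    "</configuration>"
)
segments = properties_with_prefix(root, "foo.bar")
for segment in segments:
    print(segment)
```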

Extract specific XMLs from log file

I have large log files (around 50mb each), which contain java debug information plus all kinds of XML responses
Here's an example of something I'm trying to extract from the log
<envelope>
<response>
<ATTR name="uniqueid" value="XYZ_00000-00-00_12345_1"/>
<ATTR name="status" value="Activated"/>
<ATTR name="datecreated" value="2018/10/04 09:39:05"/>
</response>
</envelope>
I need only the XMLs in which the uniqueid attribute contains "12345" and the status attribute is set to "Activated".
By using "sed" I'm able to extract all the envelopes, and currently I'm using regex to check if the above conditions exist inside of it (by running all of them in a loop).
sed -n '/<envelope>/,/<\/envelope>/p' logfile
What would be a proper solution to extract what I need from the file?
Thanks!
Assuming your XML is formatted as shown, this should work:
$ awk '/<envelope>/ {line=$0; id=0; st=0; next}
line {line=line ORS $0}
/name="uniqueid"/ && $3~/12345/ {id=1}
/name="status"/ && $3~/Activated/ {st=1}
/<\/envelope>/ {if (id && st) print line; line=""}' file
At the opening tag, start accumulating lines; set a flag for each condition when its line is found; at the closing tag, print the record if both flags are set, then clear the buffer.
With gawk you can do this instead:
$ awk -F'\n' -v RS='</envelope>\n' \
'$3~/uniqueid.*12345/ && $4~/status.*Activated/{print $0, RT}' file
There will be an extra newline in the output, though.
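A variant that does not depend on the line layout is to cut out each envelope with a regex and let a real XML parser check the attributes. The sketch below is one way to do that in Python, assuming each envelope is well formed and the log fits in memory (50 MB should be fine); the function name matching_envelopes and the sample log are made up.

```python
import re
import xml.etree.ElementTree as ET

# Non-greedy match so each <envelope>...</envelope> block is captured separately.
ENVELOPE = re.compile(r"<envelope>.*?</envelope>", re.DOTALL)

def matching_envelopes(log_text, id_part="12345", status="Activated"):
    """Return envelope blocks whose uniqueid contains id_part and whose
    status equals the given status."""
    found = []
    for block in ENVELOPE.findall(log_text):
        try:
            root = ET.fromstring(block)
        except ET.ParseError:
            continue  # skip truncated or malformed envelopes
        attrs = {a.get("name"): a.get("value", "") for a in root.iter("ATTR")}
        if id_part in attrs.get("uniqueid", "") and attrs.get("status") == status:
            found.append(block)
    return found

# Hypothetical log excerpt: two envelopes, only the first is Activated.
log = """DEBUG something happened
<envelope>
<response>
<ATTR name="uniqueid" value="XYZ_00000-00-00_12345_1"/>
<ATTR name="status" value="Activated"/>
</response>
</envelope>
DEBUG something else
<envelope>
<response>
<ATTR name="uniqueid" value="XYZ_00000-00-00_12345_2"/>
<ATTR name="status" value="Pending"/>
</response>
</envelope>
"""
hits = matching_envelopes(log)
print(len(hits))
```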

Regex to extract http links from an XML file

I have an xml file with many lines like:
<xhtml:link vip="true" href="http://store.vcenter.com/stores/en/product/tigers-midi/100" />
How do I extract just the link - http://store.vcenter.com/stores/en/product/tigers-midi/100?
I tried http://www\.\.com[^<]+ but that captures everything until the end of the line, including the quotes and closing XML tags.
I'm using this expression with egrep.
Don't parse HTML with regex, use a proper XML/HTML parser.
Check: Using regular expressions with HTML tags
You can use one of the following :
xmllint
xmlstarlet
saxon-lint
File:
<root>
<xhtml:link vip="true" href="http://store.vcenter.com/stores/en/product/tigers-midi/100" />
</root>
Example with xmllint :
xmllint --xpath '//*[@vip="true"]/@href' file.xml 2>/dev/null
Output:
href="http://store.vcenter.com/stores/en/product/tigers-midi/100"
If you need a quick & dirty one time command, you can do:
egrep -o 'https?://[^"]+' file
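The parser route can also be sketched in Python's standard library, which returns the bare URL rather than href="...". One assumption to flag loudly: the sample here declares the xhtml prefix on the root element, because a strict parser rejects an undeclared prefix; the function name vip_links is invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: the xhtml prefix is declared on the root element.
DOC = """<root xmlns:xhtml="http://www.w3.org/1999/xhtml">
<xhtml:link vip="true" href="http://store.vcenter.com/stores/en/product/tigers-midi/100" />
</root>"""

def vip_links(xml_text):
    """Return the href of every element that carries vip="true"."""
    root = ET.fromstring(xml_text)
    return [el.get("href") for el in root.iter()
            if el.get("vip") == "true" and el.get("href")]

links = vip_links(DOC)
print(links)
```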

PowerShell string replacement

I am trying to build a PowerShell script such that I give it an input file and regex, it replaces the matching content with the environment variable.
For example,
If the input file contains the following:
<?xml version="1.0" encoding="utf-8"?>
<Application xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Name="fabric:/Services" xmlns="http://schemas.microsoft.com/2011/01/fabric">
<Parameters>
<Parameter Name="IntegrationManager_PartitionCount" Value="1" />
<Parameter Name="IntegrationManager_MinReplicaSetSize" Value="2" />
<Parameter Name="IntegrationManager_TargetReplicaSetSize" Value="#{INT_MGR_IC}" />
<Parameter Name="EventManager_InstanceCount" Value="#{EVT_MGR_IC}" />
<Parameter Name="Entities_InstanceCount" Value="#{ENT_IC}" />
<Parameter Name="Profile_InstanceCount" Value="#{PRF_IC}" />
<Parameter Name="Identity_InstanceCount" Value="#{IDNT_IC}" />
</Parameters>
</Application>
I would like to build a script that replaces #{INT_MGR_IC} with the value of the INT_MGR_IC environment variable and so on.
If you know of such script or can point me in the right direction, it would be a great help. Specifically, I am interested to know how to:
Extract and loop over keys from the file such as: #{INT_MGR_IC}, #{EVT_MGR_IC}, etc.
Once I have the key, how do I replace it with an associated environment variable. For example, #{INT_MGR_IC} with INT_MGR_IC env. variable.
Thanks a lot for looking into this :)
UPDATE 1
This is the RegEx I am planning to use: /#{(.+)}/g
Just load the file using the Get-Content cmdlet, iterate over each Parameter, filter the parameters whose Value starts with a # using Where-Object, and change the value. Finally, call the Save method of the XML document to write it back:
$contentPath = 'Your_Path_Here'
$content = [xml] (Get-Content $contentPath)
$content.DocumentElement.Parameters.Parameter | Where Value -Match '^#' | ForEach-Object {
    $_.Value = "REPLACE HERE"
}
# Save resolves relative paths against the process working directory, so prefer a full path
$content.Save($contentPath)
To look up an environment variable, you could use [Environment]::GetEnvironmentVariable($name), where $name is the Value with the #{ and } delimiters stripped.
Thanks a lot to everyone who helped, especially @jisaak :)
Here is the final script I built that solved the problem in the question. Hopefully it's useful to someone!
$configPath = 'Cloud.xml'
$config = [xml] (Get-Content $configPath)
$config.DocumentElement.Parameters.Parameter | Where {$_.Value.StartsWith("#{")} | ForEach-Object {
$var = $_.Value.replace("#{","").replace("}","")
$val = (get-item env:$var).Value
Write-Host "Replacing $var with $val"
$_.Value = $val
}
$config.Save($configPath)
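On the regex from the question: .+ is greedy, so /#{(.+)}/ can swallow everything up to the last } on a line. A pattern that stops at the first closing brace avoids that. The sketch below shows the idea in Python as plain text substitution, not as a replacement for the XML-aware script above; expand_placeholders and the sample values are hypothetical, unknown keys are left untouched, and substituted values are not XML-escaped.

```python
import os
import re

# [^}]+ stops at the first closing brace, unlike the greedy .+ in the question.
PLACEHOLDER = re.compile(r"#\{([^}]+)\}")

def expand_placeholders(text, env=None):
    """Replace each #{NAME} with env[NAME]; leave it as-is when NAME is unset."""
    env = os.environ if env is None else env
    return PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), text)

# Hypothetical sample: one key is defined, the other is not.
sample = (
    '<Parameter Name="A" Value="#{INT_MGR_IC}" />'
    '<Parameter Name="B" Value="#{UNSET_KEY}" />'
)
expanded = expand_placeholders(sample, {"INT_MGR_IC": "3"})
keys = PLACEHOLDER.findall(sample)
print(keys)
print(expanded)
```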

Multi-line find (grep) and replace text in an XML file using Perl

Here I am trying to find and replace text content in an XML file using a Perl regular expression.
Sample XML Code:
<root>
<add>
<st>xxxx</st>
<pin>xxx</pin>
</add>
</root>
Now I want to find / grep the text from <add> to </add> and replace it with <xyz>xxx</xyz>:
<add>
<st>xxxx</st>
<pin>xxx</pin>
</add>
Note:
If the above content is on a single line, i.e. without line breaks between <add> and </add>, as in <add><st>xxxx</st><pin>xxx</pin></add>, I can use <add>(.*)<\/add> to find / grep it.
Thanking You
Thirusanguraja V
Using XML::XSH2, a wrapper around XML::LibXML:
open input.xml ;
$add = /root/add ;
delete $add/* ;
insert element xyz into $add ;
insert text 'xxx' into $add/xyz ;
save :b ;
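For comparison, the same tree edit can be sketched with Python's standard xml.etree.ElementTree. Because the parser works on the tree rather than on lines, line breaks inside <add> make no difference. The function name replace_children is made up, and like the xsh script it keeps <add> and swaps only its contents.

```python
import xml.etree.ElementTree as ET

def replace_children(root):
    """Empty every <add> directly under the root and give it a single
    <xyz>xxx</xyz> child."""
    for add in root.findall("add"):
        for child in list(add):
            add.remove(child)
        add.text = None  # drop leftover indentation inside <add>
        ET.SubElement(add, "xyz").text = "xxx"
    return root

# Multi-line input on purpose: the line breaks do not affect the result.
root = ET.fromstring("<root>\n<add>\n<st>xxxx</st>\n<pin>xxx</pin>\n</add>\n</root>")
replace_children(root)
output = ET.tostring(root, encoding="unicode")
print(output)
```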