Replace a substring using RegEx

Here is a line in my xyz.csproj file:
<Reference Include="SomeDLLNameHere, Version=10.2.6.0, Culture=neutral, PublicKeyToken=b88d1754d700e49a, processorArchitecture=MSIL" />
All I need to do is replace 'Version=10.2.6.0' with 'Version=11.0.0.0'.
The program I need to do this in is VSBuild, which I believe uses VBScript.
The problem is that I can't hardcode the 'old' version number. I therefore need to replace the following:
<Reference Include="SomeDLLNameHere, Version=10.2.6.0,
I therefore need a regex that will match the above, bearing in mind that in the example quoted, the 10.2.6.0 could be anything.
I believe that a regex that would select the text including and between
'<Reference Include="SomeDLLNameHere' and '>' is what I need.
There are other references to similar requests, but none seem to work for me.
I would normally use C# to do this sort of thing and VBScript/Regex is something I avoid like the plague.

For most regex flavors, you would use this:
<Reference Include="SomeDLLNameHere.*?/>
For Visual Studio, I am not sure if the *? would work... Try this:
\<Reference Include="SomeDLLNameHere[^/]*\/\>

This regex pattern should work
"(<Reference[^>]+Version=)([^,]+),"
Applied with VBScript
str1 = "<Reference Include=""SomeDLLNameHere, Version=10.2.6.0,"
' Create regular expression.
Set regEx = New RegExp
regEx.Pattern = "(<Reference[^>]+Version=)([^,]+),"
' Make replacement.
ReplaceText = regEx.Replace(str1, "$111.0.0.0,")
WScript.echo ReplaceText
Gives the correct result
<Reference Include="SomeDLLNameHere, Version=11.0.0.0,
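For comparison, here is a minimal Python sketch of the same capture-group replacement. Python is not what the question uses, and the variable names are only illustrative. It writes the backreference as \g<1> so that the digits of the new version number cannot be read as part of the group number, an ambiguity worth checking for in any engine when a replacement string looks like "$111.0.0.0".

```python
import re

line = ('<Reference Include="SomeDLLNameHere, Version=10.2.6.0, '
        'Culture=neutral, PublicKeyToken=b88d1754d700e49a, '
        'processorArchitecture=MSIL" />')

# Group 1 keeps everything up to and including "Version=";
# [^,]+ then consumes whatever the old version number happens to be.
pattern = re.compile(r'(<Reference Include="SomeDLLNameHere, Version=)[^,]+')

# \g<1> is an unambiguous reference to group 1, even when digits follow it.
updated = pattern.sub(r'\g<1>11.0.0.0', line)
print(updated)
```

Because the old version is matched by [^,]+ rather than spelled out, the same call works whatever version number is currently in the file.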
UPDATE
if you need something that matches between Version= and the end of the tag use > instead of ,
"(<Reference[^>]+Version=)([^>]+)>"

Using Regex with C# or VBScript is pretty much the same because it all comes to developing the regular expression. Something like this could help:
<Reference\s+Include\s*=\s*\".+\",\s*Version\s*=\s*.+,
Not sure what the rules are about case sensitivity and whitespace in csproj files, but this covers the form you described previously. Note that the "+" operator means one or more (the Kleene plus).

Related

Match particular CDATA sections in XML data

I am trying to write a PowerShell regex. I have the following page (further below) that I want to match against; the two parts in bold are the information I want to capture and assign to variables, so I need two regexes. From the text below, the two values I need to find are exactly King and Years & Years. Please note that these two values change (hence the reason I need to capture them); the rest of the code stays the same.
This is the regex I have at the moment, but it's not working for me.
\s+artist\s*>\s*<\s*!\s*[CDATA\s*[(.*)\s*]\s*]\s*>\s*<\s*/artist
And here is the page (or data) I am trying to use regex with.
<on_air>
<publishedInfo publishedDate="2015-07-18 16:24:28" />
<stationName><![CDATA[Mix 106.5]]></stationName>
<stationPrefix><![CDATA[mix1065]]></stationPrefix>
<generic_coverart><![CDATA[http://media.arn.com.au/images/getImage.aspx?i=generic_mix1065.jpg]]></generic_coverart>
<now_playing>
<audio ID="id_1705168034_30458146" type="song">
<title generic="False"><![CDATA[King*]]></title>
<artist><![CDATA[Years & Years]]></artist>
<number><![CDATA[46029]]></number>
<cut><![CDATA[1]]></cut>
<ref><![CDATA[]]></ref>
<played_datetime><![CDATA[2015-07-18 16:24:27]]></played_datetime>
<length><![CDATA[00:03:28]]></length>
<coverart generic="true"><![CDATA[http://media.arn.com.au/images/getImage.aspx?i=generic_mix1065.jpg]]></coverart>
<options>
<option><![CDATA[KIIS S Integrated]]></option>
</options>
</audio>
</now_playing>
If it is valid XML, then you do not need to use regular expressions. PowerShell adapts XML objects, so you can use standard property syntax to navigate them:
$xml=[xml]@'
<on_air>
<publishedInfo publishedDate="2015-07-18 16:24:28" />
<stationName><![CDATA[Mix 106.5]]></stationName>
<stationPrefix><![CDATA[mix1065]]></stationPrefix>
<generic_coverart><![CDATA[http://media.arn.com.au/images/getImage.aspx?i=generic_mix1065.jpg]]></generic_coverart>
<now_playing>
<audio ID="id_1705168034_30458146" type="song">
<title generic="False"><![CDATA[King*]]></title>
<artist><![CDATA[Years & Years]]></artist>
<number><![CDATA[46029]]></number>
<cut><![CDATA[1]]></cut>
<ref><![CDATA[]]></ref>
<played_datetime><![CDATA[2015-07-18 16:24:27]]></played_datetime>
<length><![CDATA[00:03:28]]></length>
<coverart generic="true"><![CDATA[http://media.arn.com.au/images/getImage.aspx?i=generic_mix1065.jpg]]></coverart>
<options>
<option><![CDATA[KIIS S Integrated]]></option>
</options>
</audio>
</now_playing>
</on_air>
'@
$xml.on_air.now_playing.audio.title.'#cdata-section'
$xml.on_air.now_playing.audio.artist.'#cdata-section'
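The same parse-instead-of-regex idea can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration in a different language, using a shortened copy of the document from the question; the parser exposes CDATA content as ordinary element text.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the feed from the question.
data = """<on_air>
<now_playing>
<audio ID="id_1705168034_30458146" type="song">
<title generic="False"><![CDATA[King*]]></title>
<artist><![CDATA[Years & Years]]></artist>
</audio>
</now_playing>
</on_air>"""

root = ET.fromstring(data)

# Walk to the <audio> element, then read the text of its children.
audio = root.find("./now_playing/audio")
title = audio.findtext("title")
artist = audio.findtext("artist")

print(title)
print(artist)
```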
You want to escape bracket literals.
Also, it's a good practice to avoid using the dot "match almost any character" metacharacter when your intentions are more specific. In your case, what you really want to do is match until you hit the closing bracket, so it's safer to specify that:
'\s+artist\s*>\s*<\s*!\s*\[CDATA\s*\[([^]]*)\s*\]\s*\]\s*>\s*<\s*\/artist'
Note: Regex is contextual, so the reason I don't have to escape the closing bracket within the character class is because of its position, i.e., being the first character specified in the negated class--in that context, it cannot be the closing bracket for the character class. In other words, it's not ambiguous.
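As a rough illustration of the two points above (escaped literal brackets and a negated character class instead of the dot), here is a Python sketch. The patterns are simplified versions written for this example, not the exact expression from the answer, and they assume the tags look like those in the question.

```python
import re

data = (
    '<title generic="False"><![CDATA[King*]]></title>\n'
    '<artist><![CDATA[Years & Years]]></artist>\n'
)

# \[ and \] match literal brackets; [^\]]* captures everything up to
# the first closing bracket, so the match cannot run past the CDATA end.
title_re = re.compile(r'<title[^>]*>\s*<!\[CDATA\[([^\]]*)\]\]>\s*</title>')
artist_re = re.compile(r'<artist>\s*<!\[CDATA\[([^\]]*)\]\]>\s*</artist>')

title = title_re.search(data).group(1)
artist = artist_re.search(data).group(1)

print(title)
print(artist)
```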
To help you get off the ground, here is a suggestion for y&y (insert a whitespace selector wherever possible):
artist><!\[CDATA\[Years & Years\]\]></artist

Using Regex to wrap xml element value with cdata

I have to edit a stored procedure that builds xml strings so that all the element values are wrapped in cdata. Some of the values have already been wrapped in cdata so I need to ignore those.
I figured this is a good attempt to learn some regex
From: <element>~DATA_04</element>
to: <element><![CDATA[~DATA_04]]></element>
What are my options on how to do this? I can do simple regex, this is way more advanced.
NOTE: The <element> is generic for illustration purposes, in reality, it could be anything and is unknown.
Sample text:
declare @sql nvarchar(max) =
' <data>
<header></header>
<docInfo>Blah</docInfo>
<someelement>~DATA_04</someelement>
<anotherelement><![CDATA[~DATA_05]]></anotherelement>
</data>
'
Using the sample xml, the regex would need to find someelement and add cdata to it like <someelement><![CDATA[~DATA_04]]></someelement> and leave the other elements alone.
Bear in mind, I did not write this horrible SQL code, I just have to edit it.
This is C#:
string text = Regex.Replace( inputString, @"<element>~(.+)</element>", "<element><![CDATA[~$1]]></element>" , RegexOptions.None );
The find is:
<element>~(.+)</element>
The replace is:
<element><![CDATA[~$1]]></element>
I'm assuming there is a ~ at the start of the inside of the element tag.
You will also want to watch out for whitespace if that is an issue...
You may want to add some
\s*
Any whitespace characters, zero or more matches
Try (<[^>]+>)(\~data_([^<]+))(<[^>]+>) with case-insensitive matching (the pattern spells data_ in lowercase),
and replace with \1<![CDATA[\2]]>\4
This will give you: <element><![CDATA[~DATA_04]]></element>,
where element could be anything else.
Good luck
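A minimal Python sketch of the wrapping step, under the same assumption both answers make: values to wrap start with ~. The variable names are hypothetical and the sample text is the XML from the question. A backreference ties the closing tag to the opening one, and already-wrapped values are skipped because they start with < rather than ~.

```python
import re

sql_text = """<data>
<header></header>
<docInfo>Blah</docInfo>
<someelement>~DATA_04</someelement>
<anotherelement><![CDATA[~DATA_05]]></anotherelement>
</data>"""

# <(\w+)>     any element name, captured so the close tag must match (\1)
# (~[^<]+)    a value that starts with ~ and contains no markup
pattern = re.compile(r'<(\w+)>(~[^<]+)</\1>')

wrapped = pattern.sub(r'<\1><![CDATA[\2]]></\1>', sql_text)
print(wrapped)
```

Dropping the ~ from the pattern would wrap every non-empty plain-text value (including Blah here), which may or may not be what the stored procedure needs.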

Coldfusion regex to not select an href with specific ID

I have an HTML parser doing the hard work, but I need a regex to select anchors that don't have the attribute id="optout". Here's my current regex, which selects all anchors that have an href starting with http. This is great; it just needs to ignore those anchors with id="optout". Any ideas?
Thanks!
<cfset matches = ReMatch('<a[^>]*href="http[^"]*"[^>]*>(.+?)</a>', arguments.htmlCode) />
Regex is the wrong tool for this task, and given that you've already got a HTML parser involved, there's no reason not to keep using it!
Here's the trivial way to do it with a HTML parser (jsoup):
jsoup.parse( Arguments.HtmlCode ).select('a:not([id=optout])')
Here's the far less maintainable regex way to do it:
rematch( '(?i)<a\s*(?:(?!id\s*=\s*[''"]optout[''"])[^>])+>(?:[^<]+|<(?!/a>))+</a>' , Arguments.HtmlCode )
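To show the parser-based approach without any third-party library, here is a small Python sketch using the standard html.parser module. The class name and the sample markup are made up for this example; it collects the href of every anchor that starts with http and does not carry id="optout".

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect href values of <a> tags whose id is not "optout"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        # Keep only http(s) links that are not the opt-out anchor.
        if href.startswith("http") and attributes.get("id") != "optout":
            self.hrefs.append(href)


html_code = (
    '<p><a href="http://example.com/news">News</a> '
    '<a id="optout" href="http://example.com/unsubscribe">Opt out</a> '
    '<a class="x" href="https://example.com/shop">Shop</a></p>'
)

collector = AnchorCollector()
collector.feed(html_code)
print(collector.hrefs)
```

The attribute check is a plain dictionary lookup, so attribute order and quoting style stop mattering, which is the main thing the regex version has to work around.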

Using ant <propertyregex>, how can I capture the /etc/shadow record for a user?

From ant, we want to extract a line from an old /etc/shadow file, capturing the line for a specific user name, such as "manager". This is part of a backup/restore operation. What we used previously was not specific enough, so it would match users like "mymanager", so we tried to tighten it down by anchoring the start of the string to beginning of the line (typically "^"). This definitely did not work as we expected.
How can we anchor so that we get an exact match for a username? -- answered below.
First attempt, which gave the wrong result if we had a user of "mymanager" in the /etc/shadow file copy:
<loadfile property="oldPasswords" srcFile="${backup.dir}/shadow"/>
<propertyregex property="manager.backup" input="${oldPasswords}"
regexp="(manager\:.*)" select="\1" casesensitive="true" />
Second attempt, which failed because "^" is not interpreted in the normal regular expression way by default:
<loadfile property="oldPasswords" srcFile="${backup.dir}/shadow"/>
<propertyregex property="manager.backup" input="${oldPasswords}"
regexp="^(manager\:.*)" select="\1" casesensitive="true" />
Kobi suggested adding -> flags="m" <- which sounded good but ant reported that the flags option is not supported by propertyregex.
The final, successful, approach required inserting "(?m)" at the beginning of the regexp: That was the essential change.
<propertyregex property="manager.backup" input="${oldPasswords}"
regexp="(?m)^manager:.*$" select="\0" casesensitive="true" />
The regexp with propertyregex appears to follow the rules in this documentation of regular expressions in Java (search for "multiline" for example): http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
Check the above document if you have similar questions about how to make propertyregex and regexp do what you want them to do!
THANKS! Solved.
Alan Carwile
I think the m(ultiline) flag is what you want to use and will give the start-of-line anchor the right behavior. It's possible to change flags within the regular expression with the syntax (?<flagstoturnon>-<flagstoturnoff>). So in your case, adding (?m) to the start of the regular expression (before the caret) should work.
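The effect of the inline (?m) flag can be seen in a short Python sketch; Python's re module accepts the same inline syntax as the Java Pattern class mentioned above. The shadow entries here are invented placeholder data.

```python
import re

# Invented sample content standing in for the backed-up shadow file.
shadow = (
    "root:$6$abc:16000:0:99999:7:::\n"
    "mymanager:$6$def:16000:0:99999:7:::\n"
    "manager:$6$ghi:16000:0:99999:7:::\n"
)

# Unanchored: matches inside "mymanager", which is the original bug.
unanchored = re.search(r"manager:.*", shadow).group(0)

# Anchored without (?m): ^ only matches at the very start of the input.
no_flag = re.search(r"^manager:.*$", shadow)

# Anchored with (?m): ^ and $ match at every line boundary.
anchored = re.search(r"(?m)^manager:.*$", shadow).group(0)

print(unanchored)
print(no_flag)
print(anchored)
```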

Need replace Syntax for Ant's propertyregex Task

I'm running up against my failure to understand regex substitution patterns and Apache Ant's limited documentation on propertyregex. My problem is that I need to take the ${user.name} property and make a lowercase version called ${user.name.lc} but I can't get the replace string correct.
This is what I've got:
<target name="foobar">
<echo>${user.name}</echo>
<propertyregex
property="user.name.lc"
input="${user.name}"
regexp="[A-Z]"
replace="[a-z]"
global="true" />
<echo>${user.name.lc}</echo>
</target>
It finds the upper case portions of the name correctly, but the replacement bombs. This is what I get:
foobar:
[echo] Sally Fields
[echo] [a-z]ally [a-z]ields
I've been googling and reading for about two hours trying different substitution strings. The ant document refers to groupings and shows examples with these. No help for me because there may or may not be groupings in the user name.
Can anyone provide me with what Ant calls a "regular expression substitution pattern"?
Don't use regex for this. There are only a few regex engines which support what you are looking for and I don't think propertyregex is one of them. Use this instead :
<pathconvert property="converted">
<path path="${user.name}"/>
<chainedmapper>
<flattenmapper/>
<scriptmapper language="javascript">
self.addMappedName(source.toLowerCase());
</scriptmapper>
</chainedmapper>
</pathconvert>
<echo>${converted}</echo>
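To illustrate why the replace="[a-z]" attempt produces "[a-z]ally [a-z]ields", here is a Python sketch (not Ant). In a replacement string, character classes have no meaning and are inserted literally; changing case needs either a callback or a plain string function.

```python
import re

user_name = "Sally Fields"

# A replacement string is literal text (plus group references),
# so "[a-z]" is inserted as-is for every uppercase letter matched.
literal = re.sub(r"[A-Z]", "[a-z]", user_name)

# A callback can compute the replacement from each match instead.
lowered = re.sub(r"[A-Z]", lambda m: m.group(0).lower(), user_name)

# For whole-string lowercasing no regex is needed at all.
simple = user_name.lower()

print(literal)
print(lowered)
print(simple)
```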
You can use %1> in the replace attribute. > is the standard regex symbol for converting to upper case, so your code will look like:
<propertyregex
property="user.name.lc"
input="${user.name}"
regexp="[A-Z]"
replace="%1>"
global="true" />