Search for property values within specific HTML tags - regex

Using Visual Studio, within a large ASP.NET project, I need to find all images that have an HTML class of "info". These images, however, can be added using any of the following approaches:
Directly in the page, <img alt="..." title="..." class="info" />
As an ASP image, <asp:Image runat="server" ImageUrl="..." CssClass="info" />
As string concatenation, Dim s = "<img .... class=""info"" />" (notice double quotes)
Another hurdle is that images may have multiple classes, e.g. <img ... class="foo info bar" />, so a search for class="info" doesn't work. Other HTML elements also use this class but should be ignored, e.g. <p class="info">Foo</p>.
I need a Regular Expression for searching that provides the following logic:
Must contain img or asp:Image (case-insensitive)
Must contain class or CssClass (case-insensitive)
Must contain info (case-sensitive)

It turns out the problem was that Visual Studio doesn't accept regular expressions that are pasted verbatim with their delimiters and flags.
The test I did worked fine online (see working example)
/(img|asp:Image)(?=.*class\b)(?=.*\binfo\b).*$/igm
However, this failed to find anything in Visual Studio. I didn't realise that I needed to remove the leading / delimiter and the trailing /igm delimiter and flags. Visual Studio required this revision, which works fine:
(img|asp:Image)(?=.*class\b)(?=.*\binfo\b).*$
Credit to this answer which was a lead in the right direction.
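As a quick sanity check outside Visual Studio, the revised pattern can be run against the sample lines from the question. This is only a sketch in Python, whose regex engine is not the one Visual Studio uses, although the pattern relies only on constructs common to both. The helper name matching_lines and the sample lines are illustrative assumptions. Note that with the ignore-case flag applied to the whole pattern, as in the /igm version, info is matched case-insensitively as well.

```python
import re

# The revised pattern from above; IGNORECASE and MULTILINE stand in for the
# "i" and "m" flags of the original /.../igm form.
PATTERN = re.compile(
    r"(img|asp:Image)(?=.*class\b)(?=.*\binfo\b).*$",
    re.IGNORECASE | re.MULTILINE,
)

def matching_lines(text):
    """Return the lines of text in which the pattern finds a match."""
    return [line for line in text.splitlines() if PATTERN.search(line)]

# Hypothetical sample: the first four lines should match, the last two should not.
SAMPLE = "\n".join([
    '<img alt="..." title="..." class="info" />',
    '<asp:Image runat="server" ImageUrl="..." CssClass="info" />',
    'Dim s = "<img .... class=""info"" />"',
    '<img alt="..." class="foo info bar" />',
    '<p class="info">Foo</p>',
    '<img alt="..." class="information" />',
])

for line in matching_lines(SAMPLE):
    print(line)
```

The <p> element is skipped because the line contains neither img nor asp:Image, and class="information" is skipped because \binfo\b requires a word boundary after info.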


Bootstrap group list shows text sprites at runtime

The code is listed below; it uses the Bootstrap class list-group-item-text.
As the image here shows, the text seems to have a duplicate version just above it - looks almost like dotted lines.
<asp:Panel ID="pnlInstructions" runat="server" Visible="true">
    <h4>Instructions:</h4>
    <div class="form-inline col-lg-12" id="Instructions" style="padding-top:10px;padding-bottom:20px;padding-left:10px;">
        <ol class="list-group-item-text" style="text-emphasis:filled;outline:none;">
            <li>First, please use MyEd to get your extract file ready.</li>
            <li>Then, fill in the following to Log in.</li>
        </ol>
    </div>
</asp:Panel>
I've researched the problem using the word "sprites", but that seems to have a different meaning than I expected, since I thought it meant "unwanted junk" in a display.
I'm not sure if this appears on all browsers.

An advanced Find and Replace with RegEx in Sublime Text

I have a directory full of Classic ASP legacy code, and almost all files have something similar to this:
<input type="hidden" name="driverPayment" value="<% =driverPayment %>">
Then later in the code, some JavaScript is running, and doing the following:
var x = document.getElementById('driverPayment')
This works fine in IE, but of course doesn't work anywhere else because there is no ID attribute defined.
The fix is to go through the 1770 files and manually add an ID attribute that matches the name property on the input. So make it like so:
<input type="hidden" name="driverPayment" id="driverPayment" value="<% =driverPayment %>">
Is there a way I can automate this process by using the logic below:
Get input element with a name attribute
If input has id attribute, move to next input
Else add an ID attribute to the input, and give it a value matching the value of the name attribute
I'd like to do this for the 1770 Classic ASP files I have. I am using Sublime Text.
You can use a regex to do this. My regex isn't great, but the following should work. Happy for others to improve on it. I used some regex from this question.
The find pattern skips inputs that already have an id, captures the tag up to and including the name attribute as group 1 and the name value as group 2, and the replacement writes group 1 back followed by the new id attribute.
Right-click the project folder
Choose the "Find in Folder" option
The find and replace options appear at the bottom of the screen. Select the regex option (far left).
Enter
(<input\b(?![^>]*\bid\s*=)[^>]*?\bname\s*=\s*"([^"]+)")
in the Find field
Enter
$1 id="$2"
in the Replace field
Click Replace
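Before running a replacement over 1770 files, it may be worth trying the idea on a single sample line. The following is a minimal sketch in Python of the same logic (find an input with a name but no id, then append a matching id). The pattern, the capture groups, and the function name add_missing_ids are assumptions for illustration, not necessarily the exact expression entered in Sublime Text.

```python
import re

# Assumed pattern: an <input> tag with no id attribute before the first ">",
# captured up to and including its double-quoted name attribute.
INPUT_WITHOUT_ID = re.compile(
    r'(<input\b(?![^>]*\bid\s*=)[^>]*?\bname\s*=\s*"([^"]+)")',
    re.IGNORECASE,
)

def add_missing_ids(source):
    """Append id="<name>" after the name attribute of inputs lacking an id."""
    # \1 is the matched tag text so far, \2 is the value of the name attribute.
    return INPUT_WITHOUT_ID.sub(r'\1 id="\2"', source)

before = '<input type="hidden" name="driverPayment" value="<% =driverPayment %>">'
already = '<input type="hidden" id="x" name="x" value="1">'

print(add_missing_ids(before))
print(add_missing_ids(already))  # unchanged, because an id is already present
```

One limitation of this sketch: [^>]* stops at the first ">", which in Classic ASP can be the end of a <% ... %> block, so an id attribute placed after such a block would not be detected.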

Regular expression to remove special characters in JSTL tags

I am working on a Spring application, and in a JSPX page I need to dynamically load some values from a properties file and render them as a dropdown using option tags. I need to use the same text for the option's value and for its display text, but for the value I need to remove all special characters.
For example, if the value is Maternal Uncle, then I need
<option value="MaternalUncle">Maternal Uncle</option>
What I am getting is
<option value="Maternal Uncle">Maternal Uncle</option>
Two applications can use that page, and which properties file is loaded depends on the app. If I load the values for app 1, they are displayed properly; the last value in app 1 is 'Others', which does not have any special characters. For app 2, where the last value is 'Maternal Uncle', the whitespace is not removed. repOptions in the code is an ArrayList with the values loaded from the properties file. Here is my code:
<select name="person" id="person">
<option value="na">Select the relationship</option>
<c:forEach items="${repOptions}" var="repOption">
<option value="${fn:replace(repOption, '[^A-Za-z]','')}">${repOption}</option>
</c:forEach>
</select>
The first app removes the whitespace, as this value is 4th in a list of 9. For app 2, this is the last value and the regex does not work. If I put Maternal Uncle as the first property for app 2, it works fine, but the requirement is to have it as the last option.
<option value="${fn:replace(repOption, ' ','')}">
works for whitespace, but there can be values like Brother/Sister, so I need to remove / as well; hence I am using a regex.
The JSTL fn:replace() does not perform a regular-expression-based replacement. It is just an exact charsequence-by-charsequence replacement, exactly as String#replace() does.
JSTL does not offer another EL function for that. You could just homegrow an EL function yourself which delegates to the regex-based String#replaceAll().
E.g.
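package com.example;

public final class Functions {

    private Functions() {
        // Utility class; not meant to be instantiated.
    }

    // Delegates to the regex-based String#replaceAll() so it can be called from EL.
    public static String replaceAll(String string, String pattern, String replacement) {
        return string.replaceAll(pattern, replacement);
    }
}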
Which you register in a /WEB-INF/functions.tld file as follows:
<?xml version="1.0" encoding="UTF-8" ?>
<taglib
    xmlns="http://java.sun.com/xml/ns/javaee"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-jsptaglibrary_2_1.xsd"
    version="2.1">
    <display-name>Custom Functions</display-name>
    <tlib-version>1.0</tlib-version>
    <uri>http://example.com/functions</uri>
    <function>
        <name>replaceAll</name>
        <function-class>com.example.Functions</function-class>
        <function-signature>java.lang.String replaceAll(java.lang.String, java.lang.String, java.lang.String)</function-signature>
    </function>
</taglib>
And finally use as below:
<%@taglib uri="http://example.com/functions" prefix="f" %>
...
${f:replaceAll(repOption, '[^A-Za-z]', '')}
Or, if you're already on Servlet 3.0 / EL 2.2 or newer (Tomcat 7 or newer), wherein EL started to support invoking methods with arguments, simply invoke the String#replaceAll() method directly on the string instance.
${repOption.replaceAll('[^A-Za-z]', '')}
See also:
How to call parameterized method from JSP using JSTL/EL
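The difference between literal and regex-based replacement is easy to see outside JSP. The sketch below is a Python analogy rather than JSTL or Java: str.replace treats its argument as plain characters, as fn:replace() and String#replace() do, while re.sub interprets it as a regular expression, as String#replaceAll() does. The function names are hypothetical.

```python
import re

def literal_replace(text, target, replacement):
    """Literal replacement: target is treated as plain characters."""
    return text.replace(target, replacement)

def regex_replace(text, pattern, replacement):
    """Regex replacement: pattern is interpreted as a regular expression."""
    return re.sub(pattern, replacement, text)

for value in ["Maternal Uncle", "Brother/Sister"]:
    # The literal version looks for the seven characters "[^A-Za-z]" and finds
    # none, so the value comes back unchanged; the regex version strips
    # everything that is not an ASCII letter.
    print(literal_replace(value, "[^A-Za-z]", ""), "|",
          regex_replace(value, "[^A-Za-z]", ""))
```

This also suggests why replacing ' ' appeared to work in the question: a single space is a valid literal target, whereas '[^A-Za-z]' is only meaningful as a pattern.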

Could anyone tell me why / how this XSS vector works in the browser?

I have suffered a number of XSS attacks against my site. The following HTML fragment is the XSS vector that has been injected by the attacker:
<a href="mailto:">
<a href=\"http://www.google.com onmouseover=alert(/hacked/); \" target=\"_blank\">
<img src="http://www.google.com onmouseover=alert(/hacked/);" alt="" /> </a></a>
It looks like the script shouldn't execute, but using IE9's developer tools, I was able to see that the browser translates the HTML to the following:
<a href="mailto:"/>
<a onmouseover="alert(/hacked/);" href="\"http://www.google.com" target="\"_blank\"" \?="">
</a/>
After some testing, it turns out that the \" makes the "onmouseover" attribute "live", but I don't know why. Does anyone know why this vector succeeds?
So to summarize the comments:
Sticking a character in front of the quote turns the quote into a part of the attribute value instead of marking the beginning and end of the value.
This works just as well:
href=a"http://www.google.com onmouseover=alert(/hacked/); \"
HTML allows unquoted attribute values, which end at the first whitespace, so the text becomes two attributes with the given values.
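The effect can be illustrated with a deliberately simplified model of attribute tokenising. This is a sketch in Python under stated assumptions, not a real HTML parser and not how any particular browser is implemented: a value is either quoted, or it is an unquoted run that ends at the first whitespace or ">". The names ATTR and parse_attrs are hypothetical.

```python
import re

# Simplified model of attribute tokenising, NOT a real HTML parser:
# a value is double-quoted, single-quoted, or an unquoted run that ends
# at the first whitespace or ">".
ATTR = re.compile(r'''([^\s=>]+)(?:=("[^"]*"|'[^']*'|[^\s>]*))?''')

def parse_attrs(tag_body):
    """Map attribute names to values for the text between '<a ' and '>'."""
    attrs = {}
    for name, value in ATTR.findall(tag_body):
        # Strip the delimiters only when the value really is quoted.
        if len(value) >= 2 and value[0] in "\"'" and value[-1] == value[0]:
            value = value[1:-1]
        attrs[name] = value
    return attrs

# The injected form (backslash before each quote) and a properly quoted form.
escaped = r'''href=\"http://www.google.com onmouseover=alert(/hacked/); \" target=\"_blank\"'''
quoted = 'href="http://www.google.com onmouseover=alert(/hacked/); " target="_blank"'

print(parse_attrs(escaped))
print(parse_attrs(quoted))
```

In the escaped form the value after href= starts with a backslash rather than a quote, so it is read as unquoted and stops at the first space, leaving onmouseover as an attribute of its own. In the quoted form the whole string stays inside href.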

Extract all Images from HTML whose width or height higher than a specified value - Regex

I'm trying to make a small link-sharing function with Classic ASP, like the ones on LinkedIn or Facebook.
What I need to do is get the HTML of a remote URL and extract all the images whose width is greater than, for example, 50px.
I can crawl and fetch the HTML, and I can also find the images with this regex:
<img([^<>+]*)>
It matches: <img src="/images/icon.jpg" width="60" height="90" style="display:none"/>
Then I'm able to extract the path but sometimes it matches <img src="/track.php" style="display:none" width="1" height="1"/> which is not a real image.
Anyway, I feel like you are gonna be mad because of classic ASP but my company ....
I know there are lots of topics about this issue, and mostly they recommend not using regex, but I couldn't find a way to do this with Classic ASP. Is there a component or something for this?
Regards
This will get you close:
<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>
It accepts image tags with a width of 50 or greater.
Edit: to also accept tags with unspecified widths:
<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>|<img ((?!width=)[^>])*>
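To see which tags the combined pattern accepts, here is a small sketch in Python run against the two tags from the question plus one tag without a width. Classic ASP would use its own RegExp object instead, so treat this only as a way to check the pattern; the function name candidate_images and the third sample tag are assumptions.

```python
import re

# The combined pattern from the answer: width of 50 or more, or no width attribute.
WIDE_OR_UNSIZED = re.compile(
    r'<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>|<img ((?!width=)[^>])*>'
)

def candidate_images(html):
    """Return the full <img> tags that the pattern accepts."""
    # finditer + group(0) gives whole tags; findall would return only the groups.
    return [m.group(0) for m in WIDE_OR_UNSIZED.finditer(html)]

html = (
    '<img src="/images/icon.jpg" width="60" height="90" style="display:none"/>'
    '<img src="/track.php" style="display:none" width="1" height="1"/>'
    '<img src="/photo.png" alt="no size given"/>'
)

for tag in candidate_images(html):
    print(tag)
```

The 1x1 tracking image is rejected by both alternatives: its width is below 50, and it does have a width attribute.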