I have some HTML text that I set into a TextField in Flash. I want to highlight links (either in a different colour or just by underlining them) and make sure the link target is set to "_blank".
I am really bad at RegEx. I found a handy expression on RegExr:
</?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)/?>
but I couldn't use it.
What I will be dealing with is this:
<a href="http://randomwebsite.web" />
I will need to do a String.replace()
to get something like this:
<u><a href="http://randomwebsite.web" target="_blank"/></u>
I'm not sure this can be done in one go. Priority is making sure the link has target set to blank.
I do not know how ActionScript regexes work, but since attributes can appear in any order within the tag, you can substitute <a target="_blank" href= for every <a href=. Something like this, maybe:
var pattern:RegExp = /<a\s+href=/g;
var str:String = "<a href=\"http://stackoverflow.com/\">";
// replace() returns a new string, so assign the result back.
str = str.replace(pattern, "<a target=\"_blank\" href=");
Copied from Adobe docs because I do not know much about AS3 regex syntax.
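For anyone who wants to try the substitution outside Flash, here is a minimal sketch in JavaScript, whose regex syntax is close to the ECMAScript-based syntax AS3 uses. The helper name addBlankTarget is made up for illustration, and the sketch assumes no anchor already has a target attribute.

```javascript
// Hypothetical helper: add target="_blank" to every <a ...> tag that has an href.
// Assumption: none of the anchors already carries a target attribute.
function addBlankTarget(html) {
  // \b stops the pattern from matching tags such as <abbr>;
  // the lookahead requires an href= somewhere inside the same tag.
  return html.replace(/<a\b(?=[^>]*\bhref=)/gi, '<a target="_blank"');
}

const input = '<p>See <a href="http://randomwebsite.web">this site</a>.</p>';
console.log(addBlankTarget(input));
// <p>See <a target="_blank" href="http://randomwebsite.web">this site</a>.</p>
```

Because the lookahead only checks for href and consumes nothing, the attribute order inside the tag does not matter.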
Now, manipulating HTML through regex is usually very fragile, but I think you can get away with it in this case. First, a better way to style the link would be through CSS, rather than using the <font> tag:
str = str.replace(pattern, "<a style=\"color:#00d\" target=\"_blank\" href=");
To surround the link with other tags, you have to capture everything in <a ...>anchor text</a> which is fraught with difficulty in the general case, because pretty much anything can go in there.
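As a rough illustration of capturing the whole anchor, here is a JavaScript sketch (the function name underlineLinks is hypothetical). It assumes anchors are not nested and that every opening <a> has a matching </a>; a self-closing <a ... /> would be paired with the next </a> in the text, which is exactly the kind of fragility mentioned above.

```javascript
// Hypothetical helper: wrap each <a ...>...</a> pair in <u>...</u>.
// Assumptions: anchors are not nested and every <a> has a closing </a>.
function underlineLinks(html) {
  // [\s\S]*? lazily matches any characters, including newlines, up to the first </a>.
  // In the replacement string, $& stands for the entire matched text.
  return html.replace(/<a\b[^>]*>[\s\S]*?<\/a>/gi, '<u>$&</u>');
}

console.log(underlineLinks('Go to <a href="http://randomwebsite.web">the site</a> now'));
// Go to <u><a href="http://randomwebsite.web">the site</a></u> now
```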
Another approach would be to use:
var start:RegExp = /<a\s+href=/g;
var end:RegExp = /<\/a>/g;
var str:String = "<a href=\"http://stackoverflow.com/\">";
// replace() returns a new string, so assign the result back each time.
str = str.replace(start, "<font color=\"#0000dd\"><a target=\"_blank\" href=");
str = str.replace(end, "</a></font>");
As I said, I have never used AS and so take this with a grain of salt. You might be better off if you have any way of manipulating the DOM.
Something like this might appear to work as well:
var pattern:RegExp = /<a\s+href=(.+?)<\/a>/mg;
...
str = str.replace(pattern,
"<font color=\"#0000dd\"><a target=\"_blank\" href=$1</a></font>");
I recommend this simple test tool:
http://www.regular-expressions.info/javascriptexample.html
Here's a working example with a more complex input string.
var pattern:RegExp = /<a href="([^"]*)"\s*\/>/gi;
var str:String = 'hello world <a href="http://www.stackoverflow.com/" /> hello there';
var newstr:String = str.replace(pattern, '<li><a href="$1" target="_blank" /></li>');
trace(newstr);
What about this? I needed this myself; it looks for all links (a-tags), with or without a target already set.
var pattern:RegExp = /<a ( ( [^>](?!target) )* ) ((.+)target="[^"]*")* (.*)<\/a> /xgi;
str = str.replace(pattern, '<a$1$4 target="_blank"$5<\/a>');
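An alternative to a single large pattern is a replace callback that rewrites each opening tag on its own. The JavaScript sketch below is only an illustration under stated assumptions: the name forceBlankTarget is invented, and attribute values are assumed not to contain a literal ">".

```javascript
// Hypothetical helper: make every <a> tag end up with exactly one target="_blank",
// whether or not it had a target attribute before.
// Assumption: attribute values never contain a literal ">".
function forceBlankTarget(html) {
  return html.replace(/<a\b([^>]*)>/gi, (tag, attrs) => {
    const selfClosing = /\/\s*$/.test(attrs);
    const cleaned = attrs
      .replace(/\/\s*$/, '') // drop the trailing "/" of a self-closing tag
      // drop an existing target (double-quoted, single-quoted, or unquoted)
      .replace(/\s+target\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, '')
      .replace(/\s+$/, '');
    return '<a' + cleaned + ' target="_blank"' + (selfClosing ? ' />' : '>');
  });
}

console.log(forceBlankTarget('<a href="a.html" target="_self">A</a>'));
// <a href="a.html" target="_blank">A</a>
console.log(forceBlankTarget('<a href="http://randomwebsite.web" />'));
// <a href="http://randomwebsite.web" target="_blank" />
```

Splitting the work this way keeps each pattern short enough to check by eye, at the cost of running several passes over each tag.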
I would like to strip <span> tags and their styles from within <h2> tags, e.g.
<h2><span style="font-family: sans-serif;">{text to remain}</span></h2>
or
<h2><span style="font-family:font-size: 26.3158px;">{text to remain}</span></h2>
would become
<h2>{text to remain}</h2>
Any suggestions of how to achieve this with regex? Ideally in classic ASP (don't ask).
Thanks in advance
I whipped this out pretty quick, so it may have some issues, but it worked. You will need to update it for double quotes.
<%
text = "<h2><span style='font-family:font-size: 26.3158px;'>{text to remain}</span></h2>"
dim objRegExp : set objRegExp = new RegExp
objRegExp.Pattern = "<\/?span(\ style='.*')?>"
objRegExp.IgnoreCase = True
objRegExp.Global = True
cleanText = objRegExp.replace(text, "")
response.write text
response.write cleanText
%>
Results:
<h2><span style='font-family:font-size: 26.3158px;'>{text to remain}</span></h2>
<h2>{text to remain}</h2>
Got there in the end:
objRegExp.Pattern = "(<h2(.*?)>)(<span(.*?)>)(.*?)(<\/span>)(<\/h2>)"
vartext = objRegExp.Replace(vartext, "$1$5$7")
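The same idea can be sketched in JavaScript as a two-step replace: first find each <h2> block, then strip span tags only inside it. This is an illustration rather than a drop-in for classic ASP; the function name stripSpansInH2 is made up, and the sketch assumes h2 elements are not nested.

```javascript
// Hypothetical helper: remove <span ...> and </span> tags, but only inside <h2>...</h2>.
// Assumption: <h2> elements are not nested inside one another.
function stripSpansInH2(html) {
  return html.replace(/<h2\b[^>]*>[\s\S]*?<\/h2>/gi, (h2) =>
    // Inside one heading, delete every opening and closing span tag.
    h2.replace(/<\/?span\b[^>]*>/gi, ''));
}

const sample =
  '<h2><span style="font-family: sans-serif;">{text to remain}</span></h2>' +
  '<p><span>untouched</span></p>';
console.log(stripSpansInH2(sample));
// <h2>{text to remain}</h2><p><span>untouched</span></p>
```

Unlike a single pattern with fixed groups, this also copes with several spans in one heading, or text before and after the span.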
I'm not sure how to use ASP, but I'll take a stab and guess that it uses the same regex rules as JavaScript and hope that this might help you:
var str =
'<h2><span style="font-family: sans-serif;">{text to remain}</span></h2>';
// I put the no-regex version so you can see the difference in the output.
document.write(str);
document.write(str.replace(/<\/?\s*span[^>]*>/g, ''));
Bear in mind that this is a very simple implementation and will fail in a case like: <span data-text="woah>asd">. In an ideal world, that'd be written woah&gt;asd and thus not be a problem, but HTML isn't always valid.
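If that failure case matters, one possible refinement is to make the pattern aware of quoted attribute values, so a ">" inside quotes does not end the tag. The sketch below is an assumption-laden illustration (the name stripSpanTags is invented) and still does not make regex a real HTML parser.

```javascript
// Matches <span ...> or </span>, stepping over quoted attribute values so that
// a ">" inside quotes is not mistaken for the end of the tag.
const SPAN_TAG =
  /<\/?\s*span(?:\s+[^\s=>]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]+))?)*\s*>/gi;

// Hypothetical helper built on the pattern above.
function stripSpanTags(html) {
  return html.replace(SPAN_TAG, '');
}

const tricky = '<span data-text="woah>asd">hi</span>';
console.log(tricky.replace(/<\/?\s*span[^>]*>/g, '')); // asd">hi  (simple pattern cuts the tag short)
console.log(stripSpanTags(tricky));                    // hi
```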
I'm looking for a Regex to look for html tags based on their class name, and extract their value, for example:
<span class="myclass" id="myid">Hello world</span>
I need to extract - Hello world
I've tried doing that on my own, but it seems to be more complicated than it looks.
Some help? :)
Thanks!
You can try
var str = '<span class="myclass" id="myid">Hello world</span>';
var res = str.match("<([A-Za-z][A-Za-z0-9]*)\\b[^>]*>(.*?)</\\1>");
alert(res[2]);
I would really prefer to use an HTML parser.
But, if it is really needed, you can try this https://regex101.com/r/xP5kG7/1
.+(?<="myclass")[^>]+>([^<]+).+
It will give you the desired output.
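For completeness, here is a hedged JavaScript sketch that ties the match to the class name without relying on lookbehind. The function name textByClass is hypothetical; the sketch assumes the class attribute is quoted and that the element does not contain another element of the same tag name.

```javascript
// Hypothetical helper: return the inner text of every element whose class list
// contains className. Assumptions: class is a quoted attribute, and matching
// elements do not nest an element with the same tag name.
function textByClass(html, className) {
  // Escape regex metacharacters in the caller-supplied class name.
  const safe = className.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const quote = '["\']';
  const notQuote = '[^"\']*';
  const re = new RegExp(
    '<([a-z][a-z0-9]*)\\b[^>]*\\bclass\\s*=\\s*' + quote + notQuote +
    '\\b' + safe + '\\b' + notQuote + quote + '[^>]*>([\\s\\S]*?)</\\1>', 'gi');
  const results = [];
  let m;
  while ((m = re.exec(html)) !== null) {
    results.push(m[2]); // group 2 is the text between the opening and closing tag
  }
  return results;
}

console.log(textByClass('<span class="myclass" id="myid">Hello world</span>', 'myclass'));
// [ 'Hello world' ]
```

One caveat: \b treats "-" as a boundary, so a class such as myclass-2 would also count as a hit for myclass.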
I am trying to convert URLs in a block of HTML to ensure they are lowercase.
Some of the links are a mix of uppercase and lowercase and they need to be converted to just lowercase.
It would be impossible to run round the site and redo every link, so I was looking to use a regex when outputting the text.
<p>Hello world <a href="http://www.somelink.com/BLARG">Some link</a>.</p>
Needs to be converted to:
<p>Hello world <a href="http://www.somelink.com/blarg">Some link</a>.</p>
Using a ColdFusion Regex such as below (although this doesn't work):
<cfset content = Rereplace(content,'(http[*])','\L\1','All')>
Any help much appreciated.
I think I would use the lower case function, lCase().
Put your URL into a variable, if it's not already:
<cfset MyVar = "http://www.ThisSite.com">
Force it to lower case here:
<cfset MyVar = lCase(MyVar)>
Or here:
<cfoutput>
<a href="#lCase(MyVar)#">Some Link</a>
</cfoutput>
UPDATE: Actually, I see that what you are actually asking is how to generate your entire HTML page (or a big portion) and then go back through it, find all of the links, and then lower their cases. Is that what you are trying to do?
Since you have the HTML stored in a database, there is a bit more work that needs to be done than just using lcase(). I would wrap the functionality into a function that can be easily reused. Check out this code for an example.
content = '<p>Hello world Some link.</p>
<p>Hello world Some link.</p>
<p>Hello world <a href=''http://www.somelink.com/BLARG''>Some link</a>.</p>';
writeDump( content );
writeDump( fixLinks( content ) );
function fixLinks( str ){
var links = REMatch( 'http[^"'']*', str );
for( var link in links ){
str = replace( str, link, lcase( link ), "ALL" );
}
return str;
}
This has only been tested in CF9 & CF10.
Using REMatch() you get an array of matches. You then simply loop over that array and use replace() with lcase() to make the links lowercase.
And...based on Leigh's suggestion, here is a solution in one line of code using REReplace()
REReplace( content, '(http[^"'']*)', '\L\1', 'all' )
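For comparison, the same match-then-lowercase idea can be sketched in JavaScript with a replace callback. This is an illustration only: the function name lowercaseLinks is invented, and it assumes URLs start with http:// or https:// and end at the first quote, whitespace, or angle bracket.

```javascript
// Hypothetical helper: lowercase every http(s) URL found in a block of HTML.
// Assumption: a URL ends at the first quote, whitespace, "<" or ">".
function lowercaseLinks(html) {
  return html.replace(/https?:\/\/[^"'\s<>]+/gi, (url) => url.toLowerCase());
}

const content = '<p>Hello world <a href="http://www.SomeLink.com/BLARG">Some link</a>.</p>';
console.log(lowercaseLinks(content));
// <p>Hello world <a href="http://www.somelink.com/blarg">Some link</a>.</p>
```

Only the matched URL text is changed, so the visible link text keeps its original case.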
Use an HTML parser to parse HTML, not regex.
Here's how you can do it with jQuery:
<!doctype html>
<script src="jquery.js"></script>
<cfsavecontent variable="HtmlCode">
<p>Hello world <a href="http://www.somelink.com/BLARG">Some link</a>.</p>
</cfsavecontent>
<pre></pre>
<script>
var HtmlCode = "<cfoutput>#JsStringFormat(HtmlCode)#</cfoutput>";
HtmlCode = jQuery('a[href]',HtmlCode).each( lowercaseHref ).end().html();
function lowercaseHref(index,item)
{
var $item = jQuery(item);
// prevent non-links from being changed
// (alternatively, can check for specific domain, etc)
if ( $item.attr('href').startsWith('#') )
return
$item.attr( 'href' , $item.attr('href').toLowerCase() );
}
jQuery('pre').text(HtmlCode);
</script>
This works for href attributes on <a> tags, but can of course be updated for other things.
It will ignore in-page links like <a href="#SomeId"> but not stuff like <a href="/HOME/#SomeId"> - if that's an issue you'd need to update the function to exclude page fragment part (e.g. split on # then rejoin, or whatever). Same goes if you might have case-sensitive querystrings.
And of course the above is just jQuery because I felt like it - you could also use a server-side HTML parser, like jSoup to achieve this.
I have the following output.
<img width='70' height='70' class='centreimg' src="http://localhost/aktivfitness_new/assets/images/centre/centre1.jpg" />
There are same outputs with centre2.jpg etc.
Now I want to replace centre1.jpg with hover.jpg when I hover.
But when I use the following, it becomes centre1hover.jpg:
$(".centreimg").mouseover(function() {
var hoverimg = $(this).attr("src").match(/[^\.]+/) + "hover.jpg";
...
Something is wrong with the match(/[^\.]+/) + "hover.jpg" part.
How can I do this?
Thanks in advance.
Don't you think this would be easier:
var newSrc = oldSrc.replace(/[^\/.]+(\.\w+)$/, 'hover$1');
Anyway, you shouldn't use JavaScript for hovers if there is another way. Not only is it bad practice, it's also not user-friendly: because the hover image only starts loading on mouseover, the user sees a 'flash' while it is still being fetched.
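To make the difference concrete, here is a small sketch showing why the original match() produced centre1hover.jpg and how anchoring the pattern to the end of the path avoids that. The helper name swapFileName is made up, and the sketch assumes the src has no query string or fragment after the file name.

```javascript
// Hypothetical helper: swap the file name at the end of a path, keeping the extension.
// Assumption: nothing (no "?query" or "#fragment") follows the file name.
function swapFileName(src, newBase) {
  // [^\/.]+ is the base name, (\.[^\/.]+) the extension, and $ pins both to the end.
  return src.replace(/[^\/.]+(\.[^\/.]+)$/, (match, ext) => newBase + ext);
}

const src = "http://localhost/aktivfitness_new/assets/images/centre/centre1.jpg";

// The question's approach: /[^\.]+/ matches everything up to the first dot,
// so the old base name is kept and "hover.jpg" is appended to it.
console.log(String(src.match(/[^\.]+/)) + "hover.jpg");
// http://localhost/aktivfitness_new/assets/images/centre/centre1hover.jpg

console.log(swapFileName(src, "hover"));
// http://localhost/aktivfitness_new/assets/images/centre/hover.jpg
```

Anchoring with $ also keeps dots in the host name (for example www.example.com) from being matched.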
I want to extract the image url from any website. I am reading the source info through webRequest. I want a regular expression which will fetch the Image url from this content i.e the Src value in the <img> tag.
I'd recommend using an HTML parser to read the html and pull the image tags out of it, as regexes don't mesh well with data structures like xml and html.
In C#: (from this SO question)
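using System;
using HtmlAgilityPack; // HTML Agility Pack NuGet package

var web = new HtmlWeb();
var doc = web.Load("http://www.stackoverflow.com");
// "@src" selects only <img> elements that have a src attribute.
var nodes = doc.DocumentNode.SelectNodes("//img[@src]");
foreach (var node in nodes)
{
    Console.WriteLine(node.GetAttributeValue("src", ""));
}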
/(?:\"|')[^\\x22*<>|\\\\]+?\.(?:jpg|bmp|gif|png)(?:\"|')/i
is a decent one I have used before. This gets any reference to an image file within an html document. I didn't strip " or ' around the match, so you will need to do that.
Try this*:
<img .*?src=["']?([^'">]+)["']?.*?>
Tested here with:
<img class="test" src="/content/img/so/logo.png" alt="logo homepage">
Gives
$1 = /content/img/so/logo.png
The $1 (you have to mouseover the match to see it) corresponds to the part of the regex between (). How you access that value will depend on what implementation of regex you are using.
*If you want to know how this works, leave a comment
EDIT
As nearly always with regexp, there are edge cases:
<img title="src=hack" src="/content/img/so/logo.png" alt="logo homepage">
This would be matched as 'hack'.
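One way to sidestep this particular edge case is to step over the other attributes one at a time, so that "src=" inside a quoted value is never mistaken for the real attribute. The JavaScript sketch below is only an illustration (the names IMG_SRC and imageSources are invented) and is still no substitute for an HTML parser.

```javascript
// Skips any attributes before src one by one (name plus optional quoted or
// unquoted value), then captures the src value in group 1, 2, or 3.
const IMG_SRC =
  /<img\b(?:\s+(?!src\s*=)[^\s=>]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]+))?)*\s+src\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/gi;

// Hypothetical helper: list the src value of every <img> tag in the HTML.
function imageSources(html) {
  return Array.from(html.matchAll(IMG_SRC), (m) =>
    // Exactly one of the three capture groups is set, depending on the quoting style.
    m.slice(1).find((group) => group !== undefined));
}

const page =
  '<img title="src=hack" src="/content/img/so/logo.png" alt="logo homepage">' +
  "<img class='test' src='/a.gif'>";
console.log(imageSources(page));
// [ '/content/img/so/logo.png', '/a.gif' ]
```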