How to replace all anchor tags with a different anchor using regex in ColdFusion - regex

I found a similar question here: Wrap URL within a string with a href tags using Coldfusion
But what I want to do is replace tags with a slightly modified version AFTER the user has submitted it to the server. So here is some typical HTML text that the user will submit to the server:
<p>Terminator Genisys is an upcoming 2015 American science fiction action film directed by Alan Taylor. You can find out more by <a href="...">clicking here</a></p>
What I want to do is replace the <a href=""> part with a new version, which would look like this:
<a href="..." rel="nofollow noreferrer">clicking here</a>
So I'm just adding the text rel="nofollow noreferrer" to the tag.
I must match anchor tags that contain a href attribute with a URL, not just the URL string itself, because sometimes a user could just do this:
<p>Terminator Genisys is an upcoming 2015 American science fiction action film directed by Alan Taylor. You can find out more by <a href="http://www.imdb.com">http://www.imdb.com</a></p>
In which case I still only want to replace the tag. I don't want to touch the actual anchor text used even though it is a URL.
So how could I rewrite this Regex
#REReplaceNoCase(myStr, "(\bhttp://[a-z0-9\.\-_:~###%&/?+=]+)", "<a href='\1'>\1</a>", "all")#
the other way round, so that it selects the tags and replaces them with my modified text?

If you're willing to do it client-side, this is a really easy task for jQuery.
JSFiddle: http://jsfiddle.net/mz1rwo0u/
$(document).ready(function () {
    $("a").each(function () {
        // Anchors without an href return undefined, so guard before matching
        var href = $(this).attr('href');
        if (href && href.match(/^https?:\/\/(www\.)?imdb\.com/i)) {
            $(this).attr('rel', 'nofollow noreferrer');
        }
    });
});
(If you right click any of the imdb links and Inspect Element, you'll see the rel attribute is added to the imdb links. Note that View Source won't reflect the changes, but Inspect Element is the important part.)
If you want to affect every a link, you can do this:
$(document).ready(function () {
    $("a").each(function () {
        $(this).attr('rel', 'nofollow noreferrer');
    });
});
Finally, you can also use a selector to narrow it down. For example, if the content is loaded into a DOM element with the id contentSection, you can do:
$(document).ready(function () {
    $("#contentSection a").each(function () {
        // Anchors without an href return undefined, so guard before matching
        var href = $(this).attr('href');
        if (href && href.match(/^https?:\/\/(www\.)?imdb\.com/i)) {
            $(this).attr('rel', 'nofollow noreferrer');
        }
    });
});
It's a bit tougher to parse this reliably in ColdFusion without the risk of accidentally adding the attribute twice (unless you invoke a tool like jSoup). The jQuery version is client-side and works by reading data from the DOM rather than trying to hot-wire into the markup. A jSoup implementation works similarly, creating a DOM-like structure you can work with.
When choosing between client-side and server-side, you have to consider the mythical user who doesn't have JavaScript enabled (or who turns it off with malicious intent). If this functionality is not mission-critical, I'd use jQuery to do it. I've used similar functionality to pop an alert box when the user clicks an outside link on one of my sites.
Here's a jSoup implementation, quick and dirty. jSoup is great for how it selects similarly to jQuery.
<cfscript>
    jsoup = CreateObject("java", "org.jsoup.Jsoup");
    HTMLDocument = jsoup.parse("<A href='http://imdb.com'>test</a> - <A href='http://google.com'>google</a>");
    As = HTMLDocument.select("a");
    for (link in As) {
        if (reFindNoCase("^https?:\/\/(www\.)?imdb\.com", link.attr("href"))) {
            link.attr("rel", "nofollow noreferrer");
        }
    }
    writeOutput(HTMLDocument);
</cfscript>
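For comparison, the regex-only route the question asks about can be sketched as follows. This is a JavaScript illustration rather than ColdFusion, and the helper name addRelToAnchors is hypothetical. It assumes that no attribute value contains a literal > character, and it leaves alone any tag that already has a rel attribute, which is one way to avoid adding it twice. A DOM parser such as jSoup is still likely to be the more reliable choice.

```javascript
// Hypothetical helper (not from the original answer): adds rel="nofollow noreferrer"
// to every <a> tag that has an href and does not already have a rel attribute.
// Assumption: attribute values never contain a literal ">" character.
function addRelToAnchors(html) {
  return html.replace(/<a\b[^>]*>/gi, function (tag) {
    if (!/\shref\s*=/i.test(tag)) {
      return tag; // named anchors and other href-less tags are left untouched
    }
    if (/\srel\s*=/i.test(tag)) {
      return tag; // already has a rel attribute, so do not add a second one
    }
    // Drop the closing ">" and re-append it after the new attribute
    return tag.slice(0, -1) + ' rel="nofollow noreferrer">';
  });
}

// Only the tag changes; the anchor text stays as it is, even when it is a URL
console.log(addRelToAnchors('<p>More at <a href="http://www.imdb.com">http://www.imdb.com</a></p>'));
```

Because tags that already carry rel are skipped, running the function over the same text a second time leaves it unchanged.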

Related

Removing entire tags containing a specific term using regex

I am altering a database with approximately 500 HTML pages using phpMyAdmin.
Several pages contain a Facebook Pixel or Google Tag that I would like to remove.
The easiest way I thought would be to search via regex the entire tag that contains some expression or term related to Facebook or Google, and replace it with blank.
An example would be
<script>
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag('js', new Date());
gtag('config', 'G-XXXXXXXX');
</script>
or
<script>
(window, document, 'script', 'https://connect.facebook.net/en_US/fbevents.js');
fbq('init', '9999999999999999');
fbq('track', 'salespage_xxxxxx');
</script>
Although all are unique, some have the same code or another element that makes it possible to identify each one of them.
Before running it in phpMyAdmin, I'm trying to formulate the expression using Sublime Text 3.
It's my first contact with regex and I find it fascinating, but even after following some references I can't get the search to match.
The expression I came up with after some research was
<(.*)>[\s\S]face[\s\S]<\/(.*)>
Where I thought the expression would select the entire tag containing the word "face", but it doesn't find anything.
I would like some help.
If it works, it would be able to make several other necessary changes.
This regular expression will match a <script> tag that contains the keyword face:
<(script)>(?:(?!<\/\1>|face)[\s\S])+face(?:(?!<\/\1>)[\s\S])+<\/\1>
See example: https://regex101.com/r/LfRlBV/1
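As a rough illustration, the pattern above can be tried out in JavaScript before running anything against the database. The function name removeScriptsContainingFace and the sample markup are made up for this sketch. Note that the pattern only matches a bare <script> opening tag with no attributes. The attempt in the question found nothing because [\s\S] without a quantifier matches exactly one character, so it could only match a tag whose content was a single character, then face, then one more character.

```javascript
// The pattern from the answer above, with the global flag added so every match is replaced.
// The negative lookaheads stop the match from running past the closing </script>.
const scriptWithFace = /<(script)>(?:(?!<\/\1>|face)[\s\S])+face(?:(?!<\/\1>)[\s\S])+<\/\1>/g;

// Hypothetical helper name, used only for this illustration
function removeScriptsContainingFace(html) {
  return html.replace(scriptWithFace, '');
}

// Made-up sample: one Google tag (kept) and one Facebook pixel (removed)
const samplePage =
  "<p>intro</p>" +
  "<script>\ngtag('config', 'G-XXXXXXXX');\n</script>" +
  "<script>\nfbq('init', '9999999999999999'); // https://connect.facebook.net\n</script>" +
  "<p>outro</p>";

console.log(removeScriptsContainingFace(samplePage));
```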

Extract Sharepoint 2013 wiki page HTML Source / Display wiki page without master layout in iFrame?

What I am trying to achieve is to display wiki page content in a floating iframe (keeping the styling, of course) for a tool that I am developing for our employees. Right now the tool uses a jQuery dialog box to display a specific document or PDF. For compatibility and usability purposes I would really like to upgrade it so that it uses a wiki page instead of documents and PDFs. The problem I am facing is that there is no direct link to the content of a SharePoint wiki page; the only direct link available is to the whole page, with all the navigation menus, option panel, user panel, etc. I want to avoid using JavaScript to strip away these elements. Instead I am trying to find out whether SharePoint 2013 has a more elegant way of providing the content, such as a web service or the JavaScript SP API.
My ideas so far:
REST Url to give the content back? I know for sure it works for lists and libraries but I couldn't find anything in the REST API About wiki page content
SP.js ? Couldn't find anything about that either
Anyways, it could be possible that I have overlooked things, or probably haven't searched hard enough. However, any help is very very welcome. If you don't know about a concrete solution I would be very happy with nice suggestions too :)
If there is nothing out of the box I will have to get to my backup plan of a jQuery solution to get the page and strip off all unnecessary content and keep the styling.
I believe you are on the right track with the REST API: in an Enterprise Wiki page, the content is stored in the PublishingPageContent property.
The following example demonstrates how to retrieve Enterprise Wiki Page content:
var getWikiPageContent = function (webUrl, itemId, result) {
    var listTitle = "Pages";
    var url = webUrl + "/_api/web/lists/GetByTitle('" + listTitle + "')/items(" + itemId + ")/PublishingPageContent";
    $.getJSON(url, function (data) {
        result(data.value);
    });
};
Usage
getWikiPageContent('https://contoso.sharepoint.com/',1,function(pageContent){
console.log(pageContent);
});
And here is another example, for those who like to have more than one:
var inner_content;
var page_title = "home";
$.ajax({
    url: "https://mysharepoint.sharepoint.com/MyEnterpriseWikiSite/_api/web/Lists/getbytitle('Pages')/items?$filter=Title eq '" + page_title + "'",
    type: "GET",
    headers: {
        "ACCEPT": "application/json;odata=verbose"
    },
    success: function (data) {
        // inner_content is only set once the asynchronous request has completed
        if (data.d.results[0]) {
            inner_content = data.d.results[0].PublishingPageContent;
        }
    },
    error: function () {
        // Show error here
    }
});
That's what did the job for me.
This example fetches the inner content of an Enterprise Wiki page by Title. Make sure you are not using the Name of the page: although Title and Name can be given the same string value, they are different fields in SharePoint 2013.
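When the page title comes from a variable, it may be safer to build the request URL with the title escaped. The sketch below is only an illustration: buildPageQueryUrl is a hypothetical helper and the site URL in the usage line is a placeholder. It doubles single quotes, which is how OData string literals escape them, percent-encodes the filter, and adds a $select so that only PublishingPageContent is requested.

```javascript
// Hypothetical helper (not part of SharePoint or jQuery): builds the REST URL
// used to look up a page in the "Pages" library by its Title field.
function buildPageQueryUrl(siteUrl, pageTitle) {
  // OData string literals escape a single quote by doubling it
  const escapedTitle = pageTitle.replace(/'/g, "''");
  // Percent-encode spaces and other reserved characters in the filter expression
  const filter = encodeURIComponent("Title eq '" + escapedTitle + "'");
  // Trim trailing slashes so the path is not joined with a double slash
  const base = siteUrl.replace(/\/+$/, "");
  return base + "/_api/web/lists/getbytitle('Pages')/items?$filter=" + filter +
    "&$select=PublishingPageContent";
}

// Placeholder site URL, for illustration only
console.log(buildPageQueryUrl("https://contoso.sharepoint.com/", "home"));
```

The resulting string can be passed as the url option of the $.ajax call shown above.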

Hyperlinks inside object fields

I have an object inside my Ember app, with a description field. This description field may contain hyperlinks, like this
My fancy text <a href='http://other.site.com' target='_blank'>My link</a> My fancy text continues...
However, when I output it normally, like {{ description }}, my hyperlinks are displayed as plain text. Why is this happening and how can I fix it?
Handlebars escapes any HTML within output by default. For unescaped text in markup use triple-stashes:
{{{ description }}}
There's an alternative when one controls the property: Handlebars.SafeString. SafeStrings are assumed to be safe and are not escaped either. From the documentation:
Handlebars.registerHelper('link', function(text, url) {
    text = Handlebars.Utils.escapeExpression(text);
    url = Handlebars.Utils.escapeExpression(url);
    var result = '<a href="' + url + '">' + text + '</a>';
    return new Handlebars.SafeString(result);
});
Note - please be careful with this. There are security concerns with rendering unescaped text that has come from user input; an attacker could inject a malicious script into the description and hijack your page, for example.
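To see why the link shows up as plain text with double braces, here is a minimal sketch of what HTML escaping does to the description. The escapeHtml function is a simplified stand-in written for this illustration, not Handlebars' actual implementation, and the set of characters it replaces is an assumption.

```javascript
// Simplified stand-in for default template escaping (illustration only)
function escapeHtml(text) {
  const entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#x27;" };
  return String(text).replace(/[&<>"']/g, function (ch) {
    return entities[ch];
  });
}

const description = "My fancy text <a href='http://other.site.com' target='_blank'>My link</a>";

// The browser renders the escaped string as literal text instead of parsing it as a tag
console.log(escapeHtml(description));
```

Triple braces and SafeString both skip this step, which is why they should only be used on values you trust or have sanitized yourself.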

JavaScript Regx to remove certain string if a pattern is found

Let's say I have this input string:
<div id="infoLangIcon"></div>ARA, DAN, ENGLISHinGERMAN, FRA<div id="infoPipe"></div><div id="infoRating0"></div><div id="infoPipe"></div><div id="infoMonoIcon"></div>
I want to check whether the infoRating is 0 and, if so, remove that div and the previous div as well. The expected output is:
<div id="infoLangIcon"></div>ARA, DAN, ENGLISHinGERMAN, FRA<div id="infoPipe"></div><div id="infoMonoIcon"></div>
Regex is not your best option here. It is not reliable when it comes to HTML.
I suggest you use DOM functions to do this. (I've given a JavaScript example, since you have not said which language you are using.) If I understood correctly, when there is an element with the ID infoRating0, you want to remove it and its previous sibling. This little snippet should do that:
if (document.getElementById('infoRating0')) {
    var rating0 = document.getElementById('infoRating0'),
        rParent = rating0.parentNode;
    rParent.removeChild(rating0.previousSibling);
    rParent.removeChild(rating0);
}
Also, your HTML is invalid. You can only use an ID once in your HTML. You have two divs with the same ID (infoPipe) which you should REALLY fix. Use classes instead.

Replacing a substring using regular expressions

I want to add a call to an onclick event in any anchor whose href includes a mailto: link.
For instance, I want to take any instance of:
<a href="mailto:user@domain.com">
And change it into:
<a href="mailto:user@domain.com" onclick="return function();">
The problem that I'm having is that the value of the mailto string is not consistent.
I need to say something like replace all instances of the '>' character with 'onclick="return function();">' in strings that match '<a href="mailto:*">' .
I am doing this in ColdFusion using the REreplacenocase() function but general RegEx suggestions are welcome.
The following will add your onclick to all mailto links contained within a string str:
REReplaceNoCase(
str,
"(<a[^>]*href=""mailto:[^""]*""[^>]*)>",
"\1 onclick=""return function();"">",
"all"
)
This regular expression finds any <a ...> tag that looks like an email link (i.e. has an href attribute using the mailto protocol) and adds the onclick attribute to it. Everything up to the end of the tag is stored in the first backreference (referred to by \1 in the replacement string), so that any other attributes in the <a> tag are preserved.
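The same pattern can be tried outside ColdFusion. The sketch below is a JavaScript port for illustration only: the function name addOnclickToMailto, the handlerCall parameter and the myHandler name in the usage line are all hypothetical. Like the ColdFusion version, it assumes the href value is wrapped in double quotes.

```javascript
// JavaScript port of the ColdFusion pattern above (illustrative sketch).
// handlerCall is the text placed inside the onclick attribute.
function addOnclickToMailto(str, handlerCall) {
  // Group 1 captures everything in the tag up to, but not including, the closing ">"
  const mailtoAnchor = /(<a[^>]*href="mailto:[^"]*"[^>]*)>/gi;
  return str.replace(mailtoAnchor, function (match, tagStart) {
    // Re-emit the captured tag, then the new attribute, then the ">" that was consumed
    return tagStart + ' onclick="' + handlerCall + '">';
  });
}

// myHandler is a placeholder name for whatever function should run on click
console.log(addOnclickToMailto(
  '<a href="mailto:user@domain.com" class="mail">Mail us</a> or <a href="http://example.com">visit</a>',
  "return myHandler();"
));
```

A replacer function is used instead of a replacement string so that any $ characters in handlerCall are inserted literally.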
If the only purpose of this is to add a JavaScript event handler, I don't think Regex is the best choice. If you use JavaScript to wire up your JavaScript events, you'll get more graceful degradation if JS is not available (e.g. nothing will happen, instead of having onclick cruft scattered throughout your markup).
Plus, using the DOM eliminates the possibility of missing matches or false positives that can occur from a Regex that doesn't perfectly anticipate every possible markup formation:
function myClickHandler() {
    //do stuff
    return false;
}
var links = document.getElementsByTagName('a');
// Iterate by index: for...in would yield property names, not the elements
for (var i = 0; i < links.length; i++) {
    if (links[i].href.indexOf('mailto:') == 0) {
        links[i].onclick = myClickHandler;
    }
}
Why wouldn't you do this on the frontend with a library like jQuery?
$(function(){
$("a[href^=mailto]").click(function(){
// place the code you want to execute here
})
});