I have the following URL, but I am weak at regular expressions. I am using ColdFusion
and I want to remove part of the URL before every next-page call:
http://beta.mysite.com/?jobpage=2page=2#brands
What I am trying to do: if jobpage exists, remove jobpage=2 from the URL. The 2 is dynamic; it can be 1, 2, 3 and so on.
I tried ListFirst, ListLast and GetToken, but none of them helped.
This should do it for you
<cfset myurl = "http://beta.mysite.com/?jobpage=2page=2##brands" />
<cfoutput>#myurl#</cfoutput><br><br>
<cfset myurl = ReReplaceNoCase(myurl,"(jobpage=[0-9]+[\&]?)","","ALL") />
<cfoutput>#myurl#</cfoutput>
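For readers who want to try the pattern outside ColdFusion, here is a minimal Python sketch of the same idea. The function name strip_jobpage is hypothetical, and the second URL (with an & between the parameters) is an assumed variant, not one taken from the question.

```python
import re

# Same pattern as the ReReplaceNoCase call above: "jobpage=", one or more
# digits, then an optional "&" separator. Matching is case-insensitive.
JOBPAGE = re.compile(r"jobpage=[0-9]+&?", re.IGNORECASE)

def strip_jobpage(url):
    """Remove every jobpage=N parameter from url (hypothetical helper)."""
    return JOBPAGE.sub("", url)

# URL exactly as written in the question
print(strip_jobpage("http://beta.mysite.com/?jobpage=2page=2#brands"))
# Assumed variant with "&" between the parameters
print(strip_jobpage("http://beta.mysite.com/?jobpage=12&page=2#brands"))
```

Both calls print http://beta.mysite.com/?page=2#brands, and a URL without jobpage is returned unchanged.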
Related
Here is my scenario: I want to get the cid value, without the cid: prefix, from the img src in mail content. I have inline image code like <img src="cid:ii_k4ib6vux0" alt="image.png" width="195" height="162">.
I want to get the cid value ii_k4ib6vux0. When I use the regex cid([^""']+), I get the value with the prefix, like cid:ii_k4ib6vux0, but I want only the value. Please guide me on how to get the exact value.
Thanks in advance!
You could try this:
<cfset aString = '<img src="cid:ii_k4ib6vux0" alt="image.png" width="195" height="162">' />
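<!--- REMatch returns whole matches, so the quotes are left out of the pattern to keep them out of the result --->
<cfset aMatch = REMatch('cid:[^"]*', aString) />
<cfdump var="#replace(aMatch[1], "cid:", "")#" />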
Example:
https://trycf.com/gist/9202b3dd2cca2cf0341a594dc007644f/acf2016?theme=monokai
Update:
Using REFind
<cfset aString = '<img src="cid:ii_k4ib6vux0" alt="image.png" width="195" height="162">' />
<cfset aMatch = REFind('"cid:([^"]*)"',aString,1,true,"ALL") />
<cfdump var="#aMatch[1].match[2]#" />
https://trycf.com/gist/809a30fdc16cc6deac6d1034dfb8adc2/acf2016?theme=monokai
To capture only that value without having to check the groups (the parts between parentheses), you can use a positive lookbehind:
(?<=cid:)[^"']+
This will match any run of characters other than ' and " that comes after cid:.
Otherwise, using a regex similar to yours (note the : and the removed "):
cid:([^"']+)
You need to check the first group. Depending on what language you are using, the first group may be the whole captured string and you may need to check the second group.
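As a rough illustration of the two approaches, here is a small Python sketch (Python is only used for the demonstration; the variable names are made up). In Python's re module, group 0 is the whole match and group 1 is the first parenthesised group.

```python
import re

html = '<img src="cid:ii_k4ib6vux0" alt="image.png" width="195" height="162">'

# Positive lookbehind: the whole match is already the bare value.
via_lookbehind = re.search(r"(?<=cid:)[^\"']+", html).group(0)

# Capture group: the whole match includes "cid:", group 1 is the bare value.
match = re.search(r"cid:([^\"']+)", html)
whole_match = match.group(0)
via_group = match.group(1)

print(via_lookbehind)  # ii_k4ib6vux0
print(whole_match)     # cid:ii_k4ib6vux0
print(via_group)       # ii_k4ib6vux0
```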
I'm using <cfhttp> to pull in content from another site (ColdFusion) with resolveurl="true" so that all the links work. The problem is that resolveurl also turns the anchor links (href="#search") into absolute links, which breaks them. Is there a way to make resolveurl="true" bypass anchor links?
For starters, let's use the tutorial code from Adobe.com posted in the comments. You'll want to do something similar.
<cfhttp url="https://www.adobe.com"
method="get" result="httpResp" timeout="120">
<cfhttpparam type="header" name="Content-Type" value="application/json" />
</cfhttp>
<cfscript>
// Find all the URLs in a web page retrieved via cfhttp
// The search is case sensitive
result = REMatch("https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?", httpResp.Filecontent);
</cfscript>
<!--- Now loop through those URLs --->
<cfoutput>
<cfloop array="#result#" item="item" index="index">
<cfif LEFT(item, 1) is "##">
<!---Your logic if it's just an anchor--->
<cfelse>
<!---Your logic if it's a full link--->
</cfif>
<br/>
</cfloop>
</cfoutput>
If it returns a full URL in front of the anchor, as you say (I've been getting inconsistent results with resolveurl="true"), use this to grab only the part after the #.
<cfoutput>
<cfloop array="#result#" item="item" index="index">
#ListLast(item, "##")#
</cfloop>
</cfoutput>
What this code does is grab all the URLs, and parse them for anchors.
You'll have to decide what to do next inside your loop. Maybe preserve the values and add them to a new array, so you can save it somewhere with the links fixed?
It's impossible to say more without knowing what you plan to do with the content.
There does not appear to be a way to prevent CF from resolving the hashes. In our usage of it the current result is actually beneficial since when we present content from another site we usually want the user to be sent there.
Here is a way to replace link href values with just the anchor, if one is present, using regular expressions. I'm sure there are combinations of issues that could occur here with really malformed HTML.
<cfsavecontent variable="testcontent">
<strong>test</strong>
go to google
go to section
</cfsavecontent>
<cfset domain = replace("current.domain", ".", "\.", "all") />
<cfset match = "(href\s*=\s*(""|'))\s*(http://#domain#[^##'""]+)(##[^##'""]+)\s*(""|')" />
<cfset result = reReplaceNoCase(testcontent, match, "\1\4\5", "all") />
<cfoutput><pre>#encodeForHTML(result)#</pre></cfoutput>
Output
<strong>test</strong>
go to google
<a href="#section">go to section</a>
Another option, if you are displaying the content in a normal page with JS/jQuery available, is to run through each link on display and update it to be just the anchor. This is less likely to break on malformed HTML. Let me know if you have any interest in that approach.
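To make the regex approach easier to experiment with, here is a minimal Python sketch of the same rewrite. The function name keep_anchor_only and both sample links are assumptions for illustration, and the usual caveat about malformed HTML applies here too.

```python
import re

def keep_anchor_only(html, domain):
    """Rewrite href="http://<domain>/...#anchor" to href="#anchor".

    Hypothetical helper: links to other domains, and links without an
    anchor, are left untouched.
    """
    pattern = re.compile(
        r"""(href\s*=\s*(["']))\s*http://"""   # group 1: href= plus opening quote (group 2)
        + re.escape(domain)                    # the domain, with dots escaped
        + r"""[^#'"]+(#[^#'"]+)\s*\2""",       # group 3: the anchor, then the same quote
        re.IGNORECASE,
    )
    return pattern.sub(r"\1\3\2", html)

# Assumed sample links
same_site = '<a href="http://current.domain/page.cfm#section">go to section</a>'
other_site = '<a href="http://google.com">go to google</a>'

print(keep_anchor_only(same_site, "current.domain"))
print(keep_anchor_only(other_site, "current.domain"))
```

The first link comes out as <a href="#section">go to section</a> and the second is unchanged.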
I have inherited an external page over which I have no control.
That page uses JavaScript sorting, with links like http://www.exampledomain.com/javascript:void(1);
There are many links like this, and the 1 is dynamic. What I want to do is convert each one to a ColdFusion URL like http://www.exampledomain.com/sor=1&sort=asc (or sort=desc). The number should keep its value as it is: 1, 2, 3, 4 and so on. I tried to do this with jQuery.
How can I alter these links in ColdFusion?
I tried to come up with a JavaScript solution, but it did not work:
$('#container').find('a').attr('href', function(i, old) {
    var col = decodeURIComponent(old).match(/javascript:\s*sort\((.*?)\)/)[1];
    return hrefcall+data+'&sortBy='+col;
});
Thanks
Your question is unclear, the source of the data is unknown, and there are a few typos.
This jQuery (1.9.1) code
var ihref = "";
var col = 'somedata';
$("a").click(function (i) {
ihref = $(this).attr('href');
if (ihref.match(/javascript:\s*sort\(\d+\).*/i)) {
ihref = ihref.replace(/javascript:\s*sort\((\d+)\).*/i,"http://www.w3schools.com/sor=$1&sortBy=" + col);
$(this).attr('href',ihref);
alert("As a demonstration, you\'ll see the link is rewritten when a javascript:sort url is clicked.");
}
});
will rewrite each javascript:sort(n) link, as you wanted, when it is clicked. I changed the dummy URL to w3schools.com because the good folks who own w3schools permit their site to be loaded in iframes, which is necessary to easily demonstrate that this code works.
Of course, jQuery only works when JS is enabled. Still, since you started with jQuery, I thought I might show you a working demonstration.
(The links don't actually work, because w3schools doesn't have pages at those locations, but you can see in the status bar that the links are rewritten.)
If you're retrieving the page via cfhttp.filecontent, you can do something like this
<cfset cfhttp.filecontent = ReReplaceNoCase(cfhttp.filecontent,"javascript:\s*sort\((\d+)\);?","http://www.w3schools.com/sor=\1&sort=asc","ALL")>
The ReReplaceNoCase() call was tested against this sample code:
<cfset cfhttp = {} />
<cfset col = "somedata" />
<cfsavecontent variable="cfhttp.filecontent">
test - will not alter url<Br />
test - will alter url<Br />
test - will alter url<Br />
test - will not alter url<Br />
</cfsavecontent>
<cfset cfhttp.filecontent = ReReplaceNoCase(cfhttp.filecontent,"javascript:\s*sort\((\d+)\);?","http://www.w3schools.com/sor=\1&sortBy=#col#","ALL")>
<cfdump var="#cfhttp#">
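The same substitution can be tried quickly in Python. This is only a sketch: the function name rewrite_sort_links, the base URL argument and the sample links are assumptions, and the pattern is the one from the ReReplaceNoCase() call above.

```python
import re

# "javascript:", optional whitespace, sort(<digits>), optional trailing ";"
SORT_LINK = re.compile(r"javascript:\s*sort\((\d+)\);?", re.IGNORECASE)

def rewrite_sort_links(html, base_url, col):
    """Replace javascript:sort(n) with base_url + "sor=n&sortBy=col"."""
    # A function is used as the replacement so that base_url and col are
    # inserted literally, without backslash or group-reference processing.
    return SORT_LINK.sub(
        lambda m: f"{base_url}sor={m.group(1)}&sortBy={col}", html
    )

# Assumed sample links
sample = (
    '<a href="javascript:void(0);">test - will not alter url</a>\n'
    '<a href="javascript:sort(3);">test - will alter url</a>'
)
print(rewrite_sort_links(sample, "http://www.exampledomain.com/", "somedata"))
```

Only the second link changes; its href becomes http://www.exampledomain.com/sor=3&sortBy=somedata.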
I am making a cfhttp GET call to another page, passing the URL variable with cfhttpparam as shown below. When I run the page, the URL is rendered with %2520 where the spaces should be (see the rendered URL below). I need to get rid of the %25 to get the correct URL string. Can someone tell me what is wrong with the code? I want %20 in place of the spaces in the name.
<cfset vpName = "Abc def F hig K xyz" />
<cfset urlvar = URLEncodedFormat("#vpName#")>
<!--- <cfoutput>#urlvar#</cfoutput>
--->
<cfhttp url="https://abc.com/xyz/EM2/LTMR.cfm" method="get" username="abcd" password="password" >
<cfhttpparam type="url" name="LTMX" value="#urlvar#">
</cfhttp>
<cfset myDocument = cfhttp.fileContent>
<cfoutput>#myDocument#</cfoutput>
URL is rendered as
abc.com/LTMR.cfm?LTMX=Andre%2520Fuetsch%2520%2520F%2520Shelly%2520K%2520Lazzaro
URLEncodedFormat() is doing what it is supposed to do: replacing spaces (etc.) with the appropriate encoded sequence, here %20. As Peter said, <cfhttpparam> does this automatically as well, so the value gets encoded twice and the % of each %20 becomes %25, which gives %2520. You should change this:
<cfset urlvar = URLEncodedFormat("#vpName#")>
to be this...
<cfset urlvar = vpName/>
Although you could of course simply pass in the vpName instead of creating a completely separate variable for it.
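The double-encoding effect is easy to reproduce in any language. Here is a short Python sketch that uses urllib.parse.quote as a stand-in for the two encoding steps (it is not what ColdFusion calls internally, just the same kind of percent-encoding):

```python
from urllib.parse import quote

vp_name = "Abc def F hig K xyz"

# First pass: what encoding the value by hand does (space -> %20)
encoded_once = quote(vp_name)

# Second pass: what happens when an already-encoded value is encoded again
# (the "%" of each %20 becomes %25, giving %2520)
encoded_twice = quote(encoded_once)

print(encoded_once)   # Abc%20def%20F%20hig%20K%20xyz
print(encoded_twice)  # Abc%2520def%2520F%2520hig%2520K%2520xyz
```

Passing the raw value and letting one layer do the encoding avoids the %2520.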
I need to replace the text inside all href values. I think a regular expression is the way to do it, but I'm no regex pro. Any thoughts on how I'd do the following using ColdFusion?
so it is changed to:
Thanks!
Here's an update to the question: I have this code and need the pattern below:
<cfset matches = ReMatch('<a[^>]*href="http[^"]*"[^>]*>(.+?)</a>', arguments.htmlCode) />
<cfdump var="#matches#">
<cfset links = arrayNew(1)>
<cfloop index="a" array="#matches#">
<cfset arrayAppend(links, rereplace(a, 'need regex'," {clickurl}","all"))>
</cfloop>
<cfdump var="#links#">
Here's how to do it with jSoup HTML parser:
<cfset jsoup = createObject('java','org.jsoup.Jsoup') />
<cfset Dom = jsoup.parse( InputHtml ) />
<cfset Dom.select('a[href]').attr('href','{replaced}') />
<cfset NewHtml = Dom.html() />
(On CF9 and earlier, this requires placing the jsoup jar in CF's lib directory, or using JavaLoader.)
Using an HTML parser is usually better than using regex, not least because it's easier to maintain and understand.
Here's an imperfect way of doing it with a regex:
<cfset NewHtml = InputHtml.replaceAll
( '(?<=<a.{0,99}?\shref\s{0,99}?=\s{0,99}?)(?:"[^"]+|''[^'']+)(["''])'
, '$1{replaced}$1'
)/>
Which hopefully demonstrates why using a tool such as jsoup is definitely the way to go...
(btw, the above is using the Java regex engine (via string.replaceAll), so it can use the lookbehind functionality, which doesn't exist in CF's built-in regex (rereplace/rematch/etc))
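For comparison, here is a rough Python sketch of a similar, equally imperfect regex. Python's re module only allows fixed-width lookbehind, so this version swaps the lookbehind for capture groups; the names HREF and replace_hrefs and the sample tags are assumptions for illustration.

```python
import re

# Group 1: everything from "<a" up to and including "href=".
# Group 2: the opening quote, which must also close the value (\2).
HREF = re.compile(r"""(<a\b[^>]*?\shref\s*=\s*)(["'])[^"']*\2""", re.IGNORECASE)

def replace_hrefs(html, new_value):
    """Swap the value of every quoted href inside an <a ...> tag."""
    return HREF.sub(
        lambda m: m.group(1) + m.group(2) + new_value + m.group(2), html
    )

print(replace_hrefs('<a class="x" href="http://example.com/a">A</a>', "{replaced}"))
print(replace_hrefs("<a href='page.cfm'>B</a>", "{replaced}"))
```

It still misses unquoted values and values containing the other kind of quote, which is the same kind of fragility the parser approach avoids.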
Update, based on the new code sample you've provided...
Here is an example of how to use jsoup for what you're doing - it might still need some updates (depending on what {clickurl} is eventually going to be doing), but it currently functions the same as your sample code is attempting:
<cfset jsoup = createObject('java','org.jsoup.Jsoup') />
<cfset links = jsoup.parse( Arguments.HtmlCode )
<!--- select all links beginning http and change their href --->
.select('a[href^=http]').attr('href',' {clickurl}')
<!--- get HTML for all links, then split into array. --->
.outerHtml().split('(?<=</a>)(?!$)')
/>
<cfdump var="#links#" />
That middle bit is all a single cfset, but I split it up and added comments for clarity. (You could of course do this with multiple variables and 3+ cfsets if you preferred that.)
Again, it's not a regex, because what you're doing involves parsing HTML, and regex is not designed for parsing tag-based syntax, so isn't very good at it - there are too many quirks and variations with HTML and describing them in a single regex gets very complicated very quickly.