<cfoutput>
<cfsavecontent variable="s">
This is some text. It is true that Harry Potter is a good
</cfsavecontent>
<cfset matches = reMatch("<[aA].*?>",s) />
#matches#
</cfoutput>
I need to get only "http://www.cnn.com" how to do this
#Bhargavi : Your regex is fine.
Try #matches[1]# instead.
<cfdump var="#matches#"> will show you the array of values on which the results can be manipulated accordingly.
Related
New to coldfusion, new to regex...
I have a directory of files, named with "some" followed by a 13digit number, followed by underscore, ID and file ending like so:
some0000000000000_ID.jpg
ID can be any string.
How would I get the ID using regex? I guess I'd be looking for something like this, which captures everything between the underscore and file ending dot:
_\A[A-Z]*[a-z]*[0-9]*$
but I'm really not getting anywhere. Can someone point me in the right direction?
Thanks!
EDIT:
I ended up doing it like this, which is hack-ish but works nicely:
<cfset cropFront = #ListRest(ReReplaceNoCase(name, ".png|.jpg", ""), "_")#>
<cfset cropFull = #ListFirst(ReReplaceNoCase( cropFront, "xxxxx", ""), "." )#>
Maybe useful for someone else, too!
<cfdirectory name="images" directory="#path#" filter="some?????????????_ID.jpg">
The filter is not a regex pattern. It only knows the ? and * wildcard characters.
Can't test at the moment but this is the idea...
<cfdirectory name="files" directory="path" action="list" />
<cfloop query="files">
<cfset findinfo = refind("^some(\d{13})_", files.name, 0, true) />
<cfif arraylen(findinfo.pos) eq 2>
<cfset fileid = mid(files.name, findinfo.pos[2], findinfo.len[2]) />
<!--- do something --->
</cfif>
</cfloop>
I'm doing some web scraping with ColdFusion and mostly everything is working fine. The only other issues I'm getting is that some URL's come through with text behind them that is now causing errors.
Not sure what's causing it, but it's probably my regex. Anyhow, there's a distinct pattern where text appears before the "http://". I'd like to simply remove everything before it.
Any chance you could help?
Take this string for example:
"I'M OBSESSED WITH MY BEAUTIFUL FRIEND" src="http://scs.viceland.com/feed/images/uk_970014338_300.jpg
I'd much appreciate your help as regex isn't something I've managed to make time for - hopefully I will some day!
Thanks.
EDIT:
Hi,
I thought it might be helpful to post my entire function, since it could be my initial REGEX that is causing the issue. Basically, the funcion takes one argument. In this case, it's the contents of a HTML file (via CFHTTP).
In some cases, every URL looks and works fine. If I try digg.com for example it works...but it dies on something like youtube.com. I guess this would be down to their specific HTML formatting. Either way, all I ever need is the value of the SRC attribute on image tags.
Here's what I have so far:
<cffunction name="extractImages" returntype="array" output="false" access="public" displayname="extractImages">
<cfargument name="fileContent" type="string" />
<cfset var local = {} />
<cfset local.images = [] />
<cfset local.imagePaths = [] />
<cfset local.temp = [] />
<cfset local.images = reMatchNoCase("<img([^>]*[^/]?)>", arguments.fileContent) />
<cfloop array="#local.images#" index="local.i">
<cfset local.temp = reMatchNoCase("(""|')(.*)(gif|jpg|jpeg|png)", local.i) />
<cfset local.path = local.temp />
<cfif not arrayIsEmpty(local.path)>
<cfset local.path = trim(replace(local.temp[1],"""","","all")) />
<cfset arrayAppend(local.imagePaths, local.path) />
</cfif>
<cfif isValid("url", local.path)>
<cftry>
<cfif fileExists(local.path)>
<cfset arrayAppend(local.imagePaths, local.path) />
</cfif>
<cfcatch type="any">
<cfset application.messagesObject.addMessage("error","We were not able to obtain all available images on this page.") />
</cfcatch>
</cftry>
</cfif>
</cfloop>
<cfset local.imagePaths = application.udfObject.removeArrayDuplicates(local.imagePaths) />
<cfreturn local.imagePaths />
</cffunction>
This function WORKS. However, on some sites, not so. It looks a bit over the top but much of it is just certain safeguards to make sure I get valid image paths.
Hope you can help.
Many thanks again.
Michael
Take a look at ReFind() or REFindNoCase() - http://cfquickdocs.com/cf9/#refindnocase
Here is a regex that will work.
<cfset string = 'IM OBSESSED WITH MY BEAUTIFUL FRIEND" src="http://scs.viceland.com/feed/images/uk_970014338_300.jpg' />
<cfdump var="#refindNoCase('https?://[-\w.]+(:\d+)?(/([\w/_.]*)?)?',string, 1, true)#">
You will see a structure returned with a POS and LEN keys. Use the first element in the POS array to see where the match starts, and the first element in the LEN array to see how long it is. You can then use these values in the Mid() function to grab just that matching URL.
I'm not familiar with ColdFusion, but it seems to me that you just need a regex that looks for http://, then any number of characters, then the end of the string.
I would like to achieve something I can easily do in .net.
What I would like to do is pass multiple URL parameters of the same name to build an array of those values.
In other words, I would like to take a URL string like so:
http://www.example.com/Test.cfc?method=myArrayTest&foo=1&foo=2&foo=3
And build an array from the URL parameter "foo".
In .net / C# I can do something like this:
[WebMethod]
myArrayTest(string[] foo)
And that will build a string array from the variable "foo".
What I have done so far is something like this:
<cffunction name="myArrayTest" access="remote" returntype="string">
<cfargument name="foo" type="string" required="yes">
This would output:
1,2,3
I'm not thrilled with that because it's just a comma separated string and I'm afraid that there may be commas passed in the URL (encoded of course) and then if I try to loop over the commas it may be misinterpreted as a separate param.
So, I'm stumped on how to achieve this.
Any ideas??
Thanks in advance!!
Edit: Sergii's method is more versatile. But if you are parsing the current url, and do not need to modify the resulting array, another option is using getPageContext() to extract the parameter from the underlying request. Just be aware of the two quirks noted below.
<!--- note: duplicate forces the map to be case-INsensitive --->
<cfset params = duplicate(getPageContext().getRequest().getParameterMap())>
<cfset quasiArray = []>
<cfif structKeyExists(params, "foo")>
<!--- note: this is not a *true* CF array --->
<!--- you can do most things with it, but you cannot append data to it --->
<cfset quasiArray = params["foo"]>
</cfif>
<cfdump var="#quasiArray#">
Well, if you're OK with parsing the URL, following "raw" method may work for you:
<cffunction name="myArrayTest" access="remote" output="false">
<cfset var local = {} />
<!--- parse raw query --->
<cfset local.args = ListToArray(cgi.QUERY_STRING, "&") />
<!--- grab only foo's values --->
<cfset local.foo = [] />
<cfloop array="#local.args#" index="local.a">
<cfif Left(local.a, 3) EQ "foo">
<cfset ArrayAppend(local.foo, ListLast(local.a, "=")) />
</cfif>
</cfloop>
<cfreturn SerializeJSON(local.foo) />
</cffunction>
I've tested it with this query: ?method=myArrayTest&foo=1&foo=2&foo=3,3, looks to work as expected.
Bonus. Railo's top tip: if you format the query as follows, this array will be created automatically in URL scope ?method=myArrayTest&foo[]=1&foo[]=2&foo[]=3,3.
listToArray( arguments.foo ) should give you what you want.
is there a way to set a variable with cfset that acts more like a cdata tag
or is there another way of having a page with some basic variables set and a couple of longer variables set for the main content;
ie.
<cfoutput>
<CFSET page_title = "TITLE">
<CFSET examplevariable = "ABC">
<CFSET content>
<!--something like this-->
<div>
bunch of content without any cf tags
</div>
</CFSET>
<cfinclude template="include/layout.cfm">
</cfoutput>
<cfsavecontent variable="header">
<cfoutput>
I can be HTML, javascript anything text.
remember to escape pound sysmbols ie: ##FF0000 instead of #FF0000
I can even <cfinclude template="headerpage.cfm"> and will be stored in variable
called header
</cfoutput>
</cfsavecontent>
<cfoutput>#header#</cfoutput>
My client has a database table of email bodies that get sent at certain times to customers. The text for the emails contains ColdFusion expressions like Dear #firstName# and so on. These emails are HTML - they also contain all sorts of HTML mark-up. What I'd like to do is read that text from the database into a string and then have ColdFusion Evaluate() that string to resolve the variables. When I do that, Evaluate() throws an exception because it doesn't like the HTML markup in there (I also tried filtering the string through HTMLEditFormat() as an intermediate step for grins but it didn't like the entities in there).
My predecessor solved this problem by writing the email text out to a file and then cfincluding that. It works. It's seems really hacky though. Is there a more elegant way to handle this using something like Evaluate that I'm not seeing?
What other languages often do that seems to work very well is just have some kind of token within your template that can be easily replaced by a regular expression. So you might have a template like:
Dear {{name}}, Thanks for trying {{product_name}}. Etc...
And then you can simply:
<cfset str = ReplaceNoCase(str, "{{name}}", name, "ALL") />
And when you want to get fancier you could just write a method to wrap this:
<cffunction name="fillInTemplate" access="public" returntype="string" output="false">
<cfargument name="map" type="struct" required="true" />
<cfargument name="template" type="string" required="true" />
<cfset var str = arguments.template />
<cfset var k = "" />
<cfloop list="#StructKeyList(arguments.map)#" index="k">
<cfset str = ReplaceNoCase(str, "{{#k#}}", arguments.map[k], "ALL") />
</cfloop>
<cfreturn str />
</cffunction>
And use it like so:
<cfset map = { name : "John", product : "SpecialWidget" } />
<cfset filledInTemplate = fillInTemplate(map, someTemplate) />
Not sure you need rereplace, you could brute force it with a simple replace if you don't have too many fields to merge
How about something like this (not tested)
<cfset var BaseTemplate = "... lots of html with embedded tokens">
<cfloop (on whatever)>
<cfset LoopTemplate = replace(BaseTemplate, "#firstName#", myvarforFirstName, "All">
<cfset LoopTemplate = replace(LoopTemplate, "#lastName#", myvarforLastName, "All">
<cfset LoopTemplate = replace(LoopTemplate, "#address#", myvarforAddress, "All">
</cfloop>
Just treat the html block as a simple string.
CF 7+: You may use regular expression, REReplace()?
CF 9: use Virtual File System
If the variable is in a structure from, something like a form post, then you can use "StructFind". It does exactly as you request. I ran into this issue when processing a form with dynamic inputs.
Ex.
StructFind(FORM, 'WhatYouNeed')