allow quotes in textfield coldfusion - regex

I'm building a form through which users will be able to submit articles. My current regex allows only certain characters and works as it should, but I do not know how to allow quotes as well. Here is the code:
<cfif refind ("[^A-Z a-z 0-9\\+\-\?\!\.\,\(\)]+", trim(form.articleText)) and len (trim(form.articleText)) gte 15>
<cfset msg = "The article can not contain special characters.">
</cfif>
I tried using &quot as in C#, but it does not work!

Add quotes in your character class:
<cfif refind ("[^A-Za-z0-9 +?!.,()\\""'-]+", trim(form.articleText)) ...

anubhava's answer gives you what you asked for, but the solution you probably need is something completely different: use the ESAPI encodeForX functions in CF10 (such as encodeForHtml) to encode the output appropriately for its context, instead of trying to restrict which characters can be written and constantly having to update the list.
At most, you might want something such as:
<cfif refind('[[:cntrl:]]',form.articleText) >
<cfset msg = "The article can not contain control characters.">
</cfif>
This blocks unprintable control characters without rejecting perfectly reasonable characters such as accented letters, currency symbols, and so on.
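As a minimal sketch of the encoding approach, assuming CF10 or later and that the article is written into an HTML body (form.articleText is the field from the question; the div and its class name are only illustrative):

```cfml
<!--- Encode at output time instead of rejecting characters at input time. --->
<cfoutput>
    <div class="article">#encodeForHTML(form.articleText)#</div>
</cfoutput>
```

For other output contexts the matching function would be used instead, for example encodeForHTMLAttribute or encodeForJavaScript.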

Related

Escaping + and % for email in Regular expression using ColdFusion

I am writing email validation that needs to accept a set of special characters. I could successfully add the others to the regex, but when I try '+' and '%' it gives me an error.
I used '\' to escape the special characters:
\+ --> adds a space removing + sign
\% --> removes 3rd char after % sign
ColdFusion has several built-in validation functions for things such as email addresses. You could simply use something like:
<cfif IsValid("email", YourEmailVar)>
<!--- do what you want for success here --->
<cfelse>
<!--- do what you want for validation failure here --->
</cfif>
Documentation for IsValid function
The IsValid function will also allow you to use RegEx if you prefer.
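A small sketch of the regex form of IsValid, which takes the pattern as a third argument. The field name form.productCode and the pattern are hypothetical, chosen only to illustrate the call:

```cfml
<!--- Hypothetical pattern: three uppercase letters followed by four digits. --->
<cfif IsValid("regex", form.productCode, "^[A-Z]{3}[0-9]{4}$")>
    <!--- the value matches the pattern --->
<cfelse>
    <!--- the value does not match the pattern --->
</cfif>
```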
EDIT
To validate variables from the URL scope, simply prepend the scope to the variable name, like so:
<cfif IsValid("email", URL.YourURLEmailVar)>

How to get string of everything between these two em tag?

I want to get the string between the em tags, including any other HTML inside them.
For example:
<em>UNIVERSALPOSTAL UNION - International Bureau Circular<br />
By: K.J.S. McKeown</em>
The output should be:
UNIVERSALPOSTAL UNION - International Bureau Circular<br />
By: K.J.S. McKeown
please help me.
Thanks
Use the regular expression function like this:
REMatch("(?s)<em>.*?</em>", html)
See also: http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=regexp_01.html
The (?s) sets single-line mode, so that the dot also matches line feeds and the input text is treated as one line. As Peter pointed out in a comment, this is not the default and therefore must be set.
The .*? matches all characters between <em> and </em>. The question mark after the quantifier makes it "non-greedy", so that as few characters as possible are matched. This is needed in case the input HTML contains something like <em>foo</em><em>bar</em>, where a greedy match would otherwise run from the first <em> to the last </em>.
The returned array contains all matches found, i.e. all texts including html that was in <em> tags.
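Since the question asks for the content without the surrounding tags, here is a short sketch that strips them from each match. The sample input and the variable names are made up for illustration, and the cfloop array attribute assumes CF8 or later:

```cfml
<cfset html = "<em>foo<br />bar</em> and <em>baz</em>">
<cfset matches = REMatch("(?s)<em>.*?</em>", html)>
<cfset contents = []>
<cfloop array="#matches#" index="match">
    <!--- "<em>" is 4 characters and "</em>" is 5, so drop 9 in total. --->
    <cfset arrayAppend(contents, mid(match, 5, len(match) - 9))>
</cfloop>
<cfdump var="#contents#">
```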
Note that this could fail in cases where </em> also occurs as attribute text and is incorrectly not HTML-encoded, for example: <em><a title="Help for </em> tag">click</a></em>, or in other rare circumstances (e.g. JavaScript script tags etc.). A regex cannot replace a full HTML/XML parser, and if you need 100% accuracy you should consider using one: http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=functions_t-z_23.html
If your input is exactly in the format given above, you don't even need regex - just strip the outer tags:
<cfsavecontent variable="Input">[text from above]</cfsavecontent>
<cfset Output = mid( Input, 5 , len(Input) - 9 ) />
If your input is more than this (i.e. a significant piece of HTML, or a full HTML document), regex is still not the ideal tool. Instead, you should be using an HTML parser, such as JSoup:
<cfset jsoup = createObject('java','org.jsoup.Jsoup') />
<cfset Output = jsoup.parse(Input).select('em').html() />
(With CF8, this code requires placing the jsoup JAR file in CF's lib directory, or using a tool such as JavaLoader.)
If you are using jQuery, you can also do this pretty easily:
$("em").html();
This returns the HTML inside the first em element.
See this fiddle
I had to remove any text that followed a particular tag. The HTML content was generated dynamically from a database that caters to 5 different languages, so I only had the div tag to help me. I am not sure why REMatch("(?s).*?", html) did not work for me. However, Ben helped me here (http://www.bennadel.com/blog/769-Learning-ColdFusion-8-REMatch-For-Regular-Expression-Matching.htm). My code looks like this:
<cfset extContentArr = REMatch("(?i)<div class=""inlineBlock"" style=""margin-right:30px;"">.+?</div>",qry_getContent.colval) />
<cfif !ArrayIsEmpty(extContentArr)>
<!--- Loop over the array and do whatever you need with each extract; I just deleted them. --->
</cfif>

How to get rid of last comma when generating a list?

I'm writing a web service in ColdFusion. The problem is that I cannot figure out how to get rid of the comma after the last element. My code looks like this:
<cfoutput query="Attachments">
#url#,
</cfoutput>
Which produces output like this (notice the trailing comma)
url1,url2,url3,
How can I get rid of the trailing comma and produce this instead?
url1,url2,url3
This is an easy method:
<cfoutput>#ValueList(Attachments.url)#</cfoutput>
Jake's answer is what's needed in this particular case.
For more generic cases, you can do this:
<cfloop ...>
<cfset myList=listAppend(myList,value)>
</cfloop>
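Filled in for the query from the question, that loop might look like the sketch below. It assumes the Attachments query and its url column exist as in the question; note that myList has to be initialised before the first listAppend call:

```cfml
<cfset myList = "">
<cfloop query="Attachments">
    <!--- listAppend only inserts a delimiter between elements, so no trailing comma. --->
    <cfset myList = listAppend(myList, Attachments.url)>
</cfloop>
<cfoutput>#myList#</cfoutput>
```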
There's also a bit of trickery you can do since ColdFusion (by default) ignores empty list elements:
<cfset myList=arrayToList(listToArray(myList))>
Heck, even this'll work:
<cfset myList=listChangeDelims(myList , "," , ",")>
Of course, if you're not outputting the list as a string, you don't need to worry about that comma on the end since ColdFusion will just ignore the empty element. If you are outputting it as a string, here's yet another way to clean up that comma. It's not as reliable as the others, though.
<cfoutput>#left(trim(mylist),len(trim(mylist))-1)#</cfoutput>

What do you use to render text from <textarea> to <div> with ColdFusion?

What do you use? Replace() linebreak chars with <br>? What about spaces? Like maybe replace 2 spaces with &nbsp;?
ParagraphFormat() sucks.
paragraphformat2()? http://www.cflib.org/udf.cfm/paragraphformat2
ReplaceNoCase(someString, "\n", "<br>","all")
One thing you may have to take into account is that different operating systems treat line breaks differently. Windows uses CR/LF, Unix and OS X use LF, and classic Mac OS used CR. I have used a code block effectively in the past to manage the different possibilities when reading in text files, and the same principles could apply here. It's not 100% perfect, but on the rare occasions it has failed me it was because of an odd method of how the file was created. I modified it to fit the general idea of what you are going after.
<cfset variables.CRLF = findNoCase(chr(10), variables.textFromTextarea) />
<cfif variables.CRLF>
<cfset variables.textFromTextarea = replaceNoCase(variables.textFromTextarea,"#chr(10)#","<br>","all") />
<cfelse>
<cfset variables.textFromTextarea = replaceNoCase(variables.textFromTextarea,"#chr(13)#","<br>","all") />
</cfif>
The idea here is that you look for the LF, which is present in both Windows (CR/LF) and Unix (LF) line endings. If found, replace on it. Otherwise replace on the CR. Maybe that will work for you.
I needed to convert text from a textarea input to HTML for output on a webpage while preserving the line breaks. So based on the accepted answer, I simply modified it to this, and it worked:
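An alternative sketch that avoids the branch: normalise every line ending to LF first, then replace LF once. It reuses the variables.textFromTextarea name from the code above; the intermediate variable name is an assumption:

```cfml
<cfset normalised = variables.textFromTextarea>
<!--- Windows CR/LF becomes LF, then any remaining lone CR becomes LF. --->
<cfset normalised = replace(normalised, chr(13) & chr(10), chr(10), "all")>
<cfset normalised = replace(normalised, chr(13), chr(10), "all")>
<!--- Every line break is now a single LF, so one replace is enough. --->
<cfset variables.textFromTextarea = replace(normalised, chr(10), "<br>", "all")>
```

Doing the CR/LF replacement first matters, because otherwise a Windows line break would turn into two <br> tags.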
ReplaceNoCase(someString, Chr(10), "<br />","all")
Hope that helps anyone else.

Calling a function within ReReplace function

Is there a way to write in coldfusion something like:
< cfset ReReplace(value,"&#\d+;","#decodeHtmlEntity(\1)#", "all") >
Thanks a lot
The short answer is "No".
CF doesn't handle the regular expression execution natively. It hands off to a Java library (Oro, IIRC) to handle that. This means that any CF functions you call get executed before the regex.
There is a workaround, although it's not nearly as elegant as being able to pass functions would be. Use reFind() to discover all the instances of what you are looking for, and replace them one by one. If you do the replaces last-to-first (e.g. if there are 3 instances, do the 3rd, then the 2nd, then the 1st), the starting point of each remaining match stays in the same location, so you can find all the matches up front instead of calling reFind inside the replace loop.
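A rough sketch of that kind of workaround follows. It is a forward-scanning variant rather than the last-to-first one described above: it finds each match with reFind and builds the result piece by piece, so positions never shift. The function name decodeNumericEntities is hypothetical (not a built-in), the syntax assumes CF9 or later, and the entity values are assumed to be within the range that chr() accepts:

```cfml
<cfscript>
// Hypothetical helper, not a built-in: decodes decimal numeric HTML entities.
function decodeNumericEntities(required string input) {
    var output = "";
    var pos = 1;
    // The fourth argument asks reFind for the positions of the match and its groups.
    var m = reFind("&##(\d+);", arguments.input, pos, true);
    while (m.pos[1] gt 0) {
        // Copy the text before the match unchanged.
        output &= mid(arguments.input, pos, m.pos[1] - pos);
        // Group 1 holds the digits; chr() turns that code into a character.
        output &= chr(mid(arguments.input, m.pos[2], m.len[2]));
        // Continue searching right after the current match.
        pos = m.pos[1] + m.len[1];
        m = reFind("&##(\d+);", arguments.input, pos, true);
    }
    // Append whatever follows the last match.
    return output & mid(arguments.input, pos, len(arguments.input) - pos + 1);
}
</cfscript>
<cfoutput>#decodeNumericEntities("caf&##233; &##38; bar")#</cfoutput>
```

Because the CF code runs between the individual reFind calls rather than inside the regex engine, any function can be applied to the captured value.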
HTH.
I don't think this will work if you want to replace regular expression value as argument of decodeHTMLEntity.
Updated:
<cfset myVar = ReReplace("ABC123DEF","(\d+)",addOne('\1'), "all") >
<cffunction name="addOne" access="public" output="false" returntype="string">
<cfargument name="arg1" required="true" type="string" />
<cfreturn arg1 + 1>
</cffunction>
<cfdump var="#myvar#">
The code above is meant to find 123 in the text and add one to it, but it will not work: arg1 receives the literal string \1, which is not a numeric value.
Have you tried simply using URLDecode(value)?
Or if you specifically only want to decode the numeric html codes, then
<cfset myVar = ReReplace(value,"(&##\d+;)",urlDecode('\1'), "all") >
will do what you need.
To explain what it is doing :
I've replaced the PHP decodeHTMLEntity function with the CFML version.
If you want to use back references, you need to specify the capture groups in the regex pattern.
You need to double up those #'s to escape them, otherwise CF will be looking for a closing # that it will never find.