Is it possible to validate delimiters with ColdFusion?

My CF application provides three options (semicolon, comma or tab) so users can pick the delimiter used in their file. I want to validate the user's selection against the delimiter actually used in the file. Is there a way to do this?
So if a user's text file is tab-delimited but they accidentally select comma, I get this error:
Invalid list index 2.
In function ListGetAt(list, index [, delimiters]), the value of index, 2, is not a valid as the first argument (this list has 1 elements). Valid indexes are in the range 1 through the number of elements in the list.
I think the only way to avoid this type of error is to validate the delimiter actually used in the file, but I could not find any examples when I searched the web.

You didn't specify what kind of data is delimited in the file, so here's just a very simple guessing method:
<!--- read file into memory --->
<cfset fileContent = fileRead( expandPath("yourfile.csv") )>
<!--- declare delimiting characters to check, NOTE: due to using "listLen" you may only specify single characters --->
<cfset possibleDelimiters = [ ";", ",", chr(9) ]> <!--- chr(9) = tab --->
<!--- count number of records found for each delimiter --->
<cfset countResults = {}>
<cfloop array="#possibleDelimiters#" index="delimiter">
<cfset countResults[delimiter] = listLen(fileContent, delimiter)>
</cfloop>
<!--- determine delimiter with the highest count --->
<cfset sortedDelimiters = structSort(countResults, "NUMERIC", "DESC")>
<cfset mostFrequentDelimiter = sortedDelimiters[1]>
<cfoutput>
Is <code>#encodeForHtml(mostFrequentDelimiter)# (#asc(mostFrequentDelimiter)#)</code> the delimiter?
</cfoutput>
However, this will guess terribly if you have text paragraphs in your file due to the frequency of commas in most written languages, so take it with a grain of salt.
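If guessing is too unreliable, a narrower check is to test the user's selection against the first line before calling ListGetAt(). This is only a sketch: selectedDelimiter and firstLine are hypothetical names, and the sample line is made up to stand in for a tab-delimited file.

```cfml
<!--- Hypothetical inputs: the user's form choice and the first line of their file --->
<cfset selectedDelimiter = ",">
<cfset firstLine = "id#chr(9)#name#chr(9)#email">
<!--- With the wrong delimiter the whole line counts as a single element --->
<cfset fieldCount = listLen(firstLine, selectedDelimiter)>
<cfif fieldCount LT 2>
<cfoutput>The selected delimiter does not appear to match the file.</cfoutput>
<cfelse>
<!--- Safe: index 2 is known to exist --->
<cfset secondField = listGetAt(firstLine, 2, selectedDelimiter)>
</cfif>
```

This only proves that the chosen delimiter splits the line at all, so a file with a single column would still be rejected.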

Related

ColdFusion Reg expression

I have a CSV file composed of a header line (field list) and several detail lines (details for each customer).
This file contains a lot of unused fields, so I tried to build a clean file containing only the fields I need.
To do that, I loop over the header line, get the index (position) of each field I need, and store it in a variable:
<cfset FirstName_Pos = listfind(header,'FirstName',';')>
<cfset LastName_Pos = listfind(header,'LastName',';')>
The header field separator is the ';' character.
After retrieving the positions of all the needed fields, I create a new file to hold the desired info from each line:
<cffile action="rename" source="#LocalPath#/#FileName#" destination="#LocalPath#/Source_#FileName#">
<cfset NewFile = FileOpen('#LocalPath#/#FileName#','Append')>
<cfset Newheader = 'FirstName,LastName'>
<cfset fileWriteLine(NewFile, Newheader)>
<cfloop file="#LocalPath#\Source_#FileName#" index="line">
<cfset count = count + 1>
<cfif count GTE 2>
<cfset FirstName= listgetat(line,FirstName_Pos,';',1)>
<cfset LastName= listgetat(line,LastName_Pos,';',1)>
<cfset detail = '#FirstName#,#LastName#'>
<cfset fileWriteLine(NewFile, detail)>
</cfif>
</cfloop>
The problem is that in the detail lines of the original file some fields are written as follows:
"#08/04/14 23:00;08/05/14 23:00#"
That is, the field contains the ';' character, which is the field separator I pass to the listgetat function.
As a result, I get unwanted values in the variables FirstName and LastName.
Suppose the original file contains the following:
USERID;Post;FirstName;Date1;Mail;Date2;LastName;Telephone
123;Engineer;Alan;"#08/04/14 23:00;08/05/14 23:00#";alan@yahoo.fr;"#10/04/14 11:00;10/05/14 11:00#";Jones;0624262589
I get :
FirstName;LastName
Alan;"#10/04/14 11:00
instead of
FirstName;LastName
Alan;Jones
My idea is to loop over all detail lines of the original file and use a regular expression to replace the ';' character with a space, but only in fields that have the format "#08/04/14 23:00;08/05/14 23:00#".
(The dates of course change from one field to another and from one row to another.)
<cfloop file="#LocalPath#\Source_#FileName#" index="line">
<cfset newline = rereplace(line,'"##[^\w.];[^\w.]##"','"##[^\w.] [^\w.]##"','all')>
<cfset count = count + 1>
<cfif count GTE 2>
<cfset FirstName= listgetat(newline,FirstName_Pos,';',1)>
<cfset LastName= listgetat(newline,LastName_Pos,';',1)>
<cfset detail = '#FirstName#,#LastName#'>
<cfset fileWriteLine(NewFile, detail)>
</cfif>
</cfloop>
It doesn't work, apparently because the regular expression I used is completely wrong, and maybe also because I doubled the # sign to avoid a ColdFusion syntax error.
Does anyone have an idea which regular expression I should use to deal with this situation?
Thanks in advance
this is an example of an original file
USERID;Post;FirstName;Date1;Mail;Date2;LastName;Telephone
123;Engineer;Alan;"#08/04/14 23:00;08/05/14 23:00#";alan@yahoo.fr;"#10/04/14 11:00;10/05/14 11:00#";Jones;0624262589
Parse the original file and replace all occurrences of 0;0 (digit;digit) with 0,0 (digit,digit). Then your original solution should work fine.
I believe the regex \d;\d will track them down.
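One caveat with a plain digit;digit match is that it would also hit two adjacent numeric fields. A narrower variant, sketched here on the assumption that the troublesome fields always look like "#...;...#", replaces the semicolon only inside that wrapper. The pattern is my own guess rather than anything from the original posts; the sample line and the positions 3 and 7 come from the question.

```cfml
<!--- Sample detail line from the question; ## is a literal # inside a CFML string --->
<cfset line = '123;Engineer;Alan;"##08/04/14 23:00;08/05/14 23:00##";alan@yahoo.fr;"##10/04/14 11:00;10/05/14 11:00##";Jones;0624262589'>
<!--- Replace the ";" with a space only when it sits between "# and #" --->
<cfset newline = reReplace(line, '("##[^;"]*);([^;"]*##")', '\1 \2', 'all')>
<!--- FirstName is field 3 and LastName is field 7 in the sample header --->
<cfset FirstName = listGetAt(newline, 3, ';', true)>
<cfset LastName = listGetAt(newline, 7, ';', true)>
<cfoutput>#FirstName#,#LastName#</cfoutput>
```

The backreferences \1 and \2 keep the text on either side of the semicolon, so the dates themselves are left untouched.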

Paired list with different delimiters

I have a list which I want to split out and insert into a table. The list contains paired names and values:
R0006^^1.00000000~~R0042^^1.00000000~~R0049^^1.00000000~~R0072^^1.00000000~~R0088^^3.00000000~~R0092^^1.00000000~~R0106^^1.00000000
How can I loop over this list and insert the names and values into a database table? I am struggling to get my head around the different delimiters and their associated values.
Many Thanks
JC
ColdFusion's tags and functions don't deal with multi-character delimiters very well. ArrayToList() supports multi-character delimiters but most other list-related functions do not.
If your data never contains a ~ or ^ by itself, I would take advantage of that and replace the two-character delimiters with one-character delimiters.
(Edit: As Leigh points out in the comments, a ReReplace() or ReplaceList() is not needed in this case, as CF ignores empty elements by default. Removing it won't change the output, but that's the point: having it there isn't doing anything useful either. Commented out for clarity.)
<cfset dList = "R0006^^1.00000000~~R0042^^1.00000000~~R0049^^1.00000000~~R0072^^1.00000000~~R0088^^3.00000000~~R0092^^1.00000000~~R0106^^1.00000000" />
<!---cfset dList = ReReplace(dList,"(~|\^)\1","\1","ALL")--->
<cfset dArray = ListToArray(dList,"~",false) />
<cfloop array="#dArray#" index="a1">
<cfquery...>
insert into mytable(lname,lvalue)
values(<cfqueryparam value="#listfirst(a1,"^")#">,<cfqueryparam value="#listlast(a1,"^^")#">)
</cfquery>
</cfloop>
The nice part about this is that it has pretty good backwards compatibility as well.
However, this does assume that each item in the ~ delimited list has two sub-items. If it does not, and only has the field label, you can do this.
<cfset dList = "R0006^^1.00000000~~R0042^^1.00000000~~R0049^^1.00000000~~R0072^^1.00000000~~R0088^^3.00000000~~R0092^^1.00000000~~R0106^^1.00000000" />
<!---cfset dList = ReReplace(dList,"(~|\^)\1","\1","ALL")--->
<cfset dArray = ListToArray(dList,"~",false) />
<cfloop array="#dArray#" index="a1">
<cfquery...>
insert into mytable(lname,lvalue)
values(<cfqueryparam value="#listfirst(a1,"^")#">,<cfqueryparam value="#(listlen(a1,"^") gt 1 ? listlast(a1,"^") : "")#">)
</cfquery>
</cfloop>
Finally, as David Faber points out in the comments, you can use ReplaceList(dlist, "~~,^^", "~,^") instead of ReReplace(dList,"(~|\^){2}","\1","ALL") which will achieve the same goal but has the added benefit of being easier to read for people who may not be comfortable with Regular Expressions.
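If a single ~ or ^ could turn up in the data, another option is to split on the full two-character sequences. This is a sketch that assumes a ColdFusion version where ListToArray() accepts the fourth multiCharacterDelimiter argument; the variable names pairs, parts, lname and lvalue are only illustrative, and the output line stands in for the cfquery insert.

```cfml
<cfset dList = "R0006^^1.00000000~~R0042^^1.00000000~~R0088^^3.00000000">
<!--- Fourth argument true = treat "~~" as one multi-character delimiter --->
<cfset pairs = listToArray(dList, "~~", false, true)>
<cfloop array="#pairs#" index="pair">
<!--- Split each pair on the full "^^" sequence --->
<cfset parts = listToArray(pair, "^^", false, true)>
<cfset lname = parts[1]>
<!--- Guard against a pair that has a name but no value --->
<cfset lvalue = arrayLen(parts) GT 1 ? parts[2] : "">
<cfoutput>#lname# = #lvalue#<br></cfoutput>
</cfloop>
```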
I would take a slightly different approach (for the reasons stated in my comments ... what if the character sequences ~^ or ^~ exist in the data, or even the single characters ^ or ~?) and turn the list into JSON, then deserialize it into a struct:
<cfset the_list = "R0006^^1.00000000~~R0042^^1.00000000~~R0049^^1.00000000~~R0072^^1.00000000~~R0088^^3.00000000~~R0092^^1.00000000~~R0106^^1.00000000" />
<!--- Escape characters that need to be escaped --->
<cfset the_list = replace(the_list, "\", "\\", "all") />
<cfset the_list = replace(the_list, """", "\""", "all") />
<cfset the_list = replace(the_list, "^^", """:""", "all") />
<cfset the_list = "{""" & replace(the_list, "~~", """,""", "all") & """}" />
<cfset the_coll = deserializeJSON(the_list) />
The only difficulty with the above would be if there were duplicate keys. In that case one might use an array of structs - this can be accomplished simply by changing the line replacing the double tilde ~~:
<cfset the_list = "[{""" & replace(the_list, "~~", """},{""", "all") & """}]" />
Then loop over the array to insert into the database.
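A rough sketch of that loop, assuming each struct in the array holds exactly one name/value pair. The JSON literal is a shortened stand-in for the string built above, and the cfoutput line marks where the cfquery insert would go.

```cfml
<!--- Shortened stand-in for the array-of-structs JSON built above --->
<cfset the_coll = deserializeJSON('[{"R0006":"1.00000000"},{"R0042":"1.00000000"}]')>
<cfloop array="#the_coll#" index="pair">
<!--- Each struct has a single key, so the key list is just that name --->
<cfset lname = structKeyList(pair)>
<cfset lvalue = pair[lname]>
<!--- Replace this output with the cfquery insert using cfqueryparam --->
<cfoutput>#lname# = #lvalue#<br></cfoutput>
</cfloop>
```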

Coldfusion: How to split a string into a set of variables

I'm trying to teach myself ColdFusion.
I have a string coming in from the database in this format:
domain.com
<br/>
www.facebook.com/facebookpage
<br/>
http://instagram.com/instagrampage
It is all coming from #getRacer.txtDescription#. The format of this text will always be the same.
I need to split it into 3 variables. I tried this (derived from the example on the adobe website)
<h3>ListToArray Example</h3>
<cfset myList = ValueList(getRacer.txtDescription)>
<p>My list is a list with <cfoutput>#ListLen(myList)#</cfoutput> elements.
<cfset myArrayList = ListToArray(myList,'<br/>')>
<p>My array list is an array with
<cfoutput>#ArrayLen(myArrayList)#</cfoutput> elements.
I somehow ended up with 11 items in the array.
Thank you
This should work.
<cfset TestSTring = "domain.com<br/>www.facebook.com/facebookpage<br/>http://instagram.com/instagrampage">
<cfset a = TestString.Split("<br/>")>
The reason ListToArray is returning 11 items is that ColdFusion treats each character in the delimiter string (<br/>) as a separate delimiter.
Updating my answer based on @Leigh's comment, since people should learn the ColdFusion APIs rather than fiddle with Java functions: <cfset a = ListToArray(TestString, "<br/>", false, true)> will also work. Thanks Leigh.
Note: The false at the end is for the includeEmptyFields flag and the true is for the multiCharacterDelimiter flag. See the docs.
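To see the difference side by side, here is a small sketch using the same test string. The names perChar and wholeTag are just for illustration.

```cfml
<cfset TestString = "domain.com<br/>www.facebook.com/facebookpage<br/>http://instagram.com/instagrampage">
<!--- Default behaviour: "<", "b", "r", "/" and ">" each act as a delimiter --->
<cfset perChar = ListToArray(TestString, "<br/>")>
<!--- multiCharacterDelimiter = true: only the full "<br/>" sequence splits, giving 3 elements --->
<cfset wholeTag = ListToArray(TestString, "<br/>", false, true)>
<cfoutput>#ArrayLen(perChar)# vs #ArrayLen(wholeTag)#</cfoutput>
```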
<cfset myList = ReplaceNoCase(getRacer.txtDescription,'<br/>','|','ALL')>
<cfset myArrayList = ListToArray(myList,'|')>
I chose a pipe character because it is unlikely to already exist in your string. If you wanted to account for the possibility that your BR tag may or may not use XML syntax, you could use a regex:
<cfset myList = ReReplaceNoCase(str,'<br/?>','|','ALL')>
<cfset myArrayList = ListToArray(myList,'|')>

How to split a list separated by ";" into sub lists in ColdFusion

I need to split one list, delimited by ;, into multiple sub lists. Can I do it without converting it into an array in ColdFusion?
Example: My_list contains:
[10043,10044,10045,10046:2,5,3,1;3453,2167:1,0;2346,8674,9043,7543,6453:0,4,2,0,1]
I need:
My_list1 = [10043,10044,10045,10046:2,5,3,1]
My_list2 = [3453,2167:1,0]
My_list3 = [2346,8674,9043,7543,6453:0,4,2,0,1]
... and so on.
You don't need to "do" anything. A list is just a delimited string. So if you want to set those (very poorly named, IMO) variables, it's just a matter of:
<cfset fullList = "10043,10044,10045,10046:2,5,3,1;3453,2167:1,0;2346,8674,9043,7543,6453:0,4,2,0,1">
<cfset varIndex = 0>
<cfloop index="subList" list="#fullList#" delimiters=";">
<cfset "My_list#++varIndex#" = subList>
</cfloop>
<cfdump var="#variables#">
I seriously wouldn't use dynamic variable names like that though, I'd use an array.
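For comparison, a minimal sketch of the array version. The name subLists is my own choice.

```cfml
<cfset fullList = "10043,10044,10045,10046:2,5,3,1;3453,2167:1,0;2346,8674,9043,7543,6453:0,4,2,0,1">
<!--- One array element per ";"-delimited sub list --->
<cfset subLists = listToArray(fullList, ";")>
<cfloop from="1" to="#arrayLen(subLists)#" index="i">
<cfoutput>#i#: #subLists[i]#<br></cfoutput>
</cfloop>
```

Each sub list is still a plain string, so subLists[1] can be passed to any list function just like My_list1 would be.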

When should I use # in ColdFusion?

This has been one of the biggest obstacles in teaching new people ColdFusion.
When to use # is ambiguous at best. Since using them doesn't often create a problem it seems that most people gravitate to using them too much.
So, what are the basic rules?
I think it may be easier to say where NOT to use #. The only place is in cfif statements, and cfset statements where you are not using a variable to build a string in quotes. You would need to use the # sign in almost all other cases.
Example of where you are not going to use it:
<cfset value1 = 5>
<cfset value2 = value1/>
<cfif value1 EQ value2>
Yay!!!
</cfif>
<cfset value2 = "Four plus one is " & value1/>
Examples of where you will use #:
in a cfset where the variable is surrounded by quotes
<cfset value1 = 5>
<cfset value2 = "Four plus one is #value1#"/>
the bodies of cfoutput, cfmail, and cffunction (output="yes") tags
<cfoutput>#value2#</cfoutput>
<cfmail to="e@example.com" from="e@example.com" subject="x">#value2#</cfmail>
<cffunction name="func" output="yes">#value2#</cffunction>
in an attribute value of any coldfusion tag
<cfset dsn = "myDB"/>
<cfquery name="qryUsers" datasource="#dsn#">
<cfset value1 = 5>
<cfset value2 = 10/>
<cfloop from="#value1#" to="#value2#" index="i">
<cfqueryparam value="#value1#" cfsqltype="cf_sql_integer"/>
EDIT:
One oddball little thing I just noticed that seems inconsistent is conditional loops allow the variable name to be used with and without # signs.
<cfset value1 = 5>
<cfloop condition = "value1 LTE 10">
<cfoutput>#value1#</cfoutput><br>
<cfset value1 += 1>
</cfloop>
<cfset value1 = 5>
<cfloop condition = "#value1# LTE 10">
<cfoutput>#value1#</cfoutput><br>
<cfset value1 += 1>
</cfloop>
Here's what Adobe has to say about it:
Using number signs
String interpolation:
<cfset name = "Danny" />
<cfset greeting = "Hello, #name#!" />
<!--- greeting is set to: "Hello, Danny!" --->
Auto-escaped string interpolation in cfquery:
<cfset username = "dannyo'doule" />
<cfquery ...>
select u.[ID]
from [User] u
where u.[Username] = '#username#'
</cfquery>
<!--- the query is sent to the server (auto-escaped) as: --->
<!--- select u.[ID] from [User] u where u.[Username] = 'dannyo''doule' --->
<!--- note that the single-quote in the username has been escaped --->
<!--- by cfquery before being sent to the database server --->
Passing complex arguments/attributes in CFML:
<cfset p = StructNew() />
<cfset p.firstName = "Danny" />
<cfset p.lastName = "Robinson" />
<cfmodule template="modules/view/person.cfm" person="#p#">
<!--- the variable Attributes.person will be --->
<!--- available in modules/view/person.cfm --->
Passing complex arguments requires # signs only in CFML, not CFScript. In addition, you can pass any kind of value: simple values, arrays, structs, cfcomponents, cffunctions, java objects, com objects, etc.
In all these cases, the text between the # signs does not have to be the name of a variable. In fact, it can be any expression. Of course, for string interpolation, the expression must evaluate to a simple value, but for argument/attribute passing in CFML, the expression may evaluate to any complex value as well.
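A quick illustration of expressions (not just variable names) between # signs. The variables price and qty are made up for the example.

```cfml
<cfset price = 19.5>
<cfset qty = 3>
<!--- An arithmetic expression and a function call, both interpolated into a string --->
<cfset summary = "Total: #price * qty# for #qty# items, list length #listLen('a,b,c')#">
<cfoutput>#summary#</cfoutput>
```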
The #...# syntax allows you to embed an expression within a string literal. ColdFusion is unfortunately pretty inconsistent about what's a string and what's an expression. Jayson provided a good list of examples of when to use (or not use) #s.
At the risk of sounding like a wise-guy, a rule of thumb is: use # around variables or expressions only when not doing so doesn't yield the correct result. Or: if you don't need them, don't use them.
I like Jayson's answer though.
Let's start by assuming you aren't talking about cfoutput tags, because there the answer is always. Elsewhere in your code, if you are inside quotation marks, you need # symbols whenever it would be possible to type the value literally. So if you are in a cfloop tag setting the 'to' attribute, you could easily type 6, but if you want to use a variable you need the # symbols. If you are in a cfloop tag setting the query attribute, there is no way you could type the query itself into that attribute, so no # symbols are needed.
Likewise in a cfdump tag, you can dump static text, so if you want to dump the contents of a variable you need the # symbols. This problem is generally self-correcting, but I feel your pain: your students are probably frustrated that there is no "ALWAYS USE THEM" or "NEVER USE THEM" rule. Sadly there isn't one; the only thing that is true is that only one way inside quotation marks is going to be correct. So if it isn't working, look at it long and hard and ask yourself: "Could I type that value out instead of using the value contained in that variable?" If the answer is no, the # symbols aren't needed; otherwise get your # character foo on.
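That rule of thumb can be sketched with the two cfloop attributes mentioned. The query q and the variable maxRows are made up for the example.

```cfml
<!--- Build a tiny one-row query to loop over --->
<cfset q = queryNew("name")>
<cfset queryAddRow(q)>
<cfset querySetCell(q, "name", "Danny")>
<cfset maxRows = 2>
<!--- query takes the NAME of the query; you could not type a query literally, so no # --->
<cfloop query="q">
<cfoutput>#q.name#<br></cfoutput>
</cfloop>
<!--- to takes a value you could have typed (e.g. 2), so the variable needs # --->
<cfloop from="1" to="#maxRows#" index="i">
<cfoutput>#i#<br></cfoutput>
</cfloop>
```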