CFSEARCH Solr Collection limitations - coldfusion

I have migrated a Verity-based CFSEARCH solution to a Solr-based one and am finding that Solr does not return all results when searching against multiple collections. I am going to work around this by running separate CFSEARCH calls and consolidating the results, but I wanted to know whether a better workaround exists that would allow things to work with just one CFSEARCH call. The code that does not return the proper results is pretty simple:
<CFsearch NAME="EMCSearch"
COLLECTION="apropos,certegy,cmco,conoco,contracts,corpbadge_pki,cust_train,delft_dc,documentation_help,dsvcs,grti,gts,infosys,mgmt_tools,pers,processes,scc,sd,slb,srv_desc,tips,voip,WAN_Work_Procedures,west"
CRITERIA="#LCase(searchfor)#">
That returns a record count of 23. If, however, I change it to this, I get a combined record count of 76:
<cfset lstCols = "apropos,certegy,cmco,conoco,contracts,corpbadge_pki,cust_train,delft_dc,documentation_help,dsvcs,grti,gts,infosys,mgmt_tools,pers,processes,scc,sd,slb,srv_desc,tips,voip,WAN_Work_Procedures,west" />
<cfloop list="#Variables.lstCols#" index="Col">
<CFsearch NAME="EMCSearch"
COLLECTION="#Col#"
CRITERIA="#LCase(searchfor)#">
</cfloop>

Related

cfloop vs cfoutput on queries

I develop using ColdFusion and wanted to know what is the best strategy to loop over large query result set. Is there any performance difference between using cfloop and cfoutput? If not, is there any reason to prefer one over the other?
I believe that there used to be. I think this difference has since been addressed, so the best bet is to time each approach in your specific use case.
<cfset t = GetTickCount()/>
<cf... query="qry">
<!--- Do something --->
</cf...>
<cfset dt = GetTickCount() - t/>
<cfdump var="#dt#"/>
<!---
If the differences are small you can use java.lang.System.nanoTime() instead
--->
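For very short loops, the millisecond resolution of GetTickCount() may be too coarse. Here is a minimal sketch of the nanoTime() variant mentioned in the comment above. The in-memory query qry and the names sys, n, and dtMs are assumptions added only so the snippet stands alone.

```cfml
<!--- Hypothetical test data: a 1000-row query built in memory --->
<cfset qry = QueryNew("id", "integer") />
<cfloop from="1" to="1000" index="n">
    <cfset QueryAddRow(qry) />
    <cfset QuerySetCell(qry, "id", n) />
</cfloop>
<!--- java.lang.System exposes nanoTime() as a static method --->
<cfset sys = createObject("java", "java.lang.System") />
<cfset t = sys.nanoTime() />
<cfloop query="qry">
    <!--- Do something --->
</cfloop>
<!--- Convert elapsed nanoseconds to milliseconds --->
<cfset dtMs = (sys.nanoTime() - t) / 1000000 />
<cfdump var="#dtMs#" />
```

Running the same timing around a cfoutput query loop gives the second number to compare against.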
There are some notable differences though. cfoutput can do grouped loops, which cfloop cannot.
<cfoutput query="qry" group="col">
<!--- Loops once for each group --->
<cfoutput>
<!--- Loops once for each record within the group --->
</cfoutput>
</cfoutput>
For cfoutput you can specify the startrow and the maxrows (or the count) to paginate your result. For cfloop you have to specify the endrow index instead of the count.
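To make the pagination difference concrete, here is a small sketch that shows the second page of ten rows both ways. The query qry and the variables pageSize, startRow, and endRow are made up for illustration.

```cfml
<!--- Hypothetical data: 25 rows built in memory --->
<cfset qry = QueryNew("id", "integer") />
<cfloop from="1" to="25" index="n">
    <cfset QueryAddRow(qry) />
    <cfset QuerySetCell(qry, "id", n) />
</cfloop>

<cfset pageSize = 10 />
<cfset startRow = 11 />

<!--- cfoutput: start row plus a row count --->
<cfoutput query="qry" startrow="#startRow#" maxrows="#pageSize#">#qry.id# </cfoutput>

<!--- cfloop: start row plus an end row index, capped at the last row --->
<cfset endRow = Min(startRow + pageSize - 1, qry.recordCount) />
<cfoutput>
<cfloop query="qry" startrow="#startRow#" endrow="#endRow#">#qry.id# </cfloop>
</cfoutput>
```

Both loops should visit rows 11 through 20 of the sample query.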
Also you cannot use cfoutput for a query nested within an existing cfoutput tag, you will need to end the containing cfoutput first.
One good reason to use cfloop instead of cfoutput is if you need to loop over one query's output inside another query's output: cfoutput does not support nested query output. You can, however, get away with it using cfloop. So:
<cfoutput query="test1">
#test1ID#
<cfoutput query="test2">
#test2ID#
</cfoutput>
</cfoutput>
does not work, but if you replace the cfoutputs with cfloops, it will.
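For reference, a sketch of the cfloop version. The two in-memory queries are assumptions standing in for test1 and test2, and a single plain cfoutput wraps both loops so the pound-sign expressions are still evaluated.

```cfml
<!--- Hypothetical in-memory queries so the sketch stands alone --->
<cfset test1 = QueryNew("test1ID", "integer") />
<cfset test2 = QueryNew("test2ID", "integer") />
<cfloop from="1" to="2" index="n">
    <cfset QueryAddRow(test1) />
    <cfset QuerySetCell(test1, "test1ID", n) />
    <cfset QueryAddRow(test2) />
    <cfset QuerySetCell(test2, "test2ID", n * 10) />
</cfloop>
<!--- One cfoutput for evaluation; the looping itself is done by cfloop --->
<cfoutput>
<cfloop query="test1">
    #test1.test1ID#
    <cfloop query="test2">
        #test2.test2ID#
    </cfloop>
</cfloop>
</cfoutput>
```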
As of CF10, with the ability to group cfloops, that's the only remaining functional difference. They both perform the same.
I believe it's all the same as far as performance goes; see Ben Forta.
And the rest is pretty much personal preference as far as how you "like" to work with your loop. Keep in mind you should always scope your variables, but inside a cfoutput loop that would be especially important since the query fields "could" be referenced without referring to their scope.
One reason you may prefer the cfloop approach would be if you needed to "escape" cfoutput during your loop for any reason. I have run into that several times, so I generally prefer cfloop.
There wouldn't be a performance difference using either method, it depends on your coding style really. If you put a <cfoutput> at the top and bottom of every page then using <cfloop> will work great. If you use multiple <cfoutput> and only place where they are needed that works as well.
I personally put <cfoutput> only where they are necessary, but I wouldn't say that's more correct than placing them at the top and bottom of the page.

Unable to display the data returned by a sql function in ColdFusion

We are trying to display a record set in a webform using the cffunction in ColdFusion.
But we are unable to display the output. We are just getting a single value (the last value in the record set). The SQL function is returning a ref cursor. Please help us resolve this issue!
thanks!
<cfset bare_drive_result = functionname('x','y')>
<cffunction name="functionname" hint="Gets all customer from the database" returntype="query">
<cfargument name="sPcwQua" type=any>
<cfargument name="sPcwAcc" type=any>
<cfquery name="getRkBareDrive" datasource= "#PCW_dsn#">
select pacakagename.functionname('#sPcwQua#','#sPcwAcc#') bare_drive_result
from dual
</cfquery>
<cfreturn getRkBareDrive>
</cffunction>
<cfoutput>#getRkBareDrive.bare_drive_result#</cfoutput>
You have a scoping issue. The name of the query inside your method is getRKBareDrive, however, you are setting the result of that function call to another variable name - bare_drive_result.
The code would actually be:
<cfoutput query="bare_drive_result">#bare_drive_result.bare_drive_result#</cfoutput>
And inside your function, you need to add this line just after the cfargument tags:
<cfset var getRkBareDrive = "" />
However, that does not solve your immediate problem. Have you tried doing:
<cfdump var="#bare_drive_result#" />
to see what is actually returned to ColdFusion?
Everything @Scott Stroz said in his answer is true: 1) you are confusing variable names between the calling code and the inline function, 2) use a var scope to define the getRKBareDrive variable inside your function (not directly part of the issue, but good practice), and 3) try to CFDUMP the result instead of CFOUTPUT it. However, I don't believe the core issue is addressed, which lies with this part of your question:
The sql function is returning ref cursor
So one glaring issue is that you are returning a REFCURSOR not a simple value, so you cannot just CFOUTPUT it but instead need to take certain steps so ColdFusion knows it is a query set. REFCURSOR can be returned in calls from Stored Procedures via CFPROCRESULT (or via CFPROCPARAM type="OUT" CFSQLType="CF_SQL_REFCURSOR" if it's a returned parameter not a result).
So... try converting the CFQUERY call to a CFSTOREDPROC. Here's some sample code that assumes you can call your package/function directly as a stored proc vs through a query. I removed the inline function as it's adding too much complexity here (again see Scott's answer) -- just try bare code as a way to get the call working.
<cfstoredproc procedure="pacakagename.functionname" datasource= "#PCW_dsn#">
<cfprocparam type="IN" CFSQLType="CF_SQL_VARCHAR" value="x">
<cfprocparam type="IN" CFSQLType="CF_SQL_VARCHAR" value="y">
<cfprocresult name="bare_drive_result" >
</cfstoredproc>
<cfdump var="#bare_drive_result#">
If you see results in the dump, you should be able to replace the dump and output the fields within the ref cursor just like a normal CFQUERY result within a CFOUTPUT query="bare_drive_result" call.
From CF8 documentation on CFPROCRESULT:
CFML supports Oracle 8 and 9 Reference Cursor type, which passes a
parameter by reference. Parameters that are passed this way can be
allocated and deallocated from memory within the execution of one
application. To use reference cursors in packages or stored
procedures, use the cfprocresult tag. This causes the ColdFusion JDBC
database driver to put Oracle reference cursors into a result set.
(You cannot use this method with Oracle's ThinClient JDBC drivers.)
Try this
<cfoutput query="getRkBareDrive">#getRkBareDrive.bare_drive_result#</cfoutput>
I suspect Scott's solution will help you out, though if you are still having issues you can add a result parameter to your query to help troubleshoot & see exactly what is being passed.
<cfquery name="getRkBareDrive" datasource= "#PCW_dsn#" result="qryname">
then dump that:
<cfdump var="#qryname#" />
The returned structure should show you the query passed in, the number of results, etc. (I suspect it is actually returning only one). See the full description here: http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=Tags_p-q_17.html

Speed up QoQ's or an alternative approach?

I am building an application that performs a master query with many joins. This query data is then available to the whole application to play around with in a global variable. The query refreshes or gets the latest result set on each page refresh; so it's only in the same state for the life of the request.
In other parts of this application, I sometimes run hundreds of QoQs on this data, usually as the result of recursive function calls. However, while QoQ is a great feature, it's not too fast, and sometimes page loads can take between 3000 and 5000 ms on a bad day. It's just not fast enough.
Is there any kind of optimisation techniques I can do to make QoQ perform faster or perhaps an alternative method? I read an interesting article by Ben Nadel on Duplicate() function - is there any scope for using that and if so, how?
I would love to hear your thoughts.
Don't worry about crazy suggestions, this is a personal project so I'm willing to take risks. I'm running this on Railo compatible with CF8.
Many thanks,
Michael.
Without seeing the code and complexity of the QoQs it is hard to say for sure the best approach, however one thing you can do is use a struct to index the records outside of a QoQ. Much of the overhead of using QoQ is building new query objects, and using a struct write only approach is much more efficient than for example looping over the original query and making comparisons.
For example:
<!--- build up index --->
<cfset structindex = {} />
<cfset fields = "first,last,company" />
<cfloop list="#fields#" index="field">
<!--- initialize each key (instead of using structkeyexists) --->
<cfloop query="q">
<!--- the key depends on the current row, so build it inside the query loop --->
<cfset key = "field:#field#,value:#q[field][currentrow]#" />
<cfset structindex[key] = "" />
</cfloop>
<cfloop query="q">
<!--- update each key with list of matching row indexes --->
<cfset key = "field:#field#,value:#q[field][currentrow]#" />
<cfset structindex[key] = listappend(structindex[key], currentrow) />
</cfloop>
</cfloop>
<!--- save structindex to global variable --->
<!--- output rows matching index --->
<cfset key = "field:company,value:stackexchange" />
<cfoutput>
<cfloop list="#structindex[key]#" index="row">
#q.last[row]#, #q.first[row]# (#q.company[row]#)<br />
</cfloop>
</cfoutput>
If this doesn't match your need provide some examples of the QoQ statements and how many records are in the main query.
First, I would look at the time taken by the master query. If it can be cached for some amount of time and is taking a good chunk of the page load time, I would cache it.
Next, I would look at the recursive calls. If they can be made iterative, that would probably speed things up. I realize this is not always possible. I would be surprised if this isn't your biggest time sink. Without knowing more about what you are doing, though, it's hard to help you optimize this.
I might also consider writing some of the recursive QoQs as stored procedures on the DB server, which is designed to handle data quickly and slice and dice efficiently. CF is not -- QoQs are very useful, but not speed demons (as you've noted).
Finally, I would look for straightforward filters, and not use QoQ for them. Rather, I would just run a loop over the master query in a standard cfoutput tag, and filter on the fly. This means you are looping over the master query once, rather than the master query once and the result query once.
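A rough sketch of that last idea, filtering inside the loop rather than running a QoQ first. The query q, its columns, and the sample rows are hypothetical stand-ins for the master query.

```cfml
<!--- Hypothetical stand-in for the master query --->
<cfset q = QueryNew("first,last,company", "varchar,varchar,varchar") />
<cfset QueryAddRow(q, 2) />
<cfset QuerySetCell(q, "first", "Ann", 1) />
<cfset QuerySetCell(q, "last", "Lee", 1) />
<cfset QuerySetCell(q, "company", "stackexchange", 1) />
<cfset QuerySetCell(q, "first", "Bob", 2) />
<cfset QuerySetCell(q, "last", "Ray", 2) />
<cfset QuerySetCell(q, "company", "other", 2) />

<!--- One pass over the master query; rows that do not match are skipped --->
<cfoutput query="q">
    <cfif q.company EQ "stackexchange">
        #q.last#, #q.first#<br />
    </cfif>
</cfoutput>
```

This only pays off for simple equality-style filters; anything that needs sorting or aggregation is still easier in a QoQ or in the database.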
There are two primary solutions here. First, you could do something in CF with the records outside of QoQ. I posted my suggestion on this already. The other is to do everything in the DB. One way I've found to do this is to use a subquery as a temp table. You can even keep the SQL statement in a global variable and then reference it in the same places where you currently use QoQ, but doing a real query to the database. It may sound slower than one trip to the DB followed by many QoQs, but in reality it probably isn't if the tables are indexed efficiently.
select *
from (
#sqlstring#
) as tmp
where company = 'stackexchange'
I have actually done this for a system with complex criteria for both what records a user should have access to and what they can filter for in those records. Going with this approach means you always know the source of the inner records instead of trying to ensure every single query is pulling correctly.
Edit:
It is actually safer (and usually more efficient) to use queryparams whenever possible. I found this can be done by including a file containing the SQL statement...
select *
from (
<cfinclude template="master_subquery.cfm" />
) as tmp
where company = 'stackexchange'

How can one get a list of all queries that have run on a page in ColdFusion 9

I would like to add some code to my Application.cfc onRequestEnd function that, if a certain application variable flag is on, will log query sql and execution time to a database table. That part is relatively easy, since ColdFusion returns the sql and execution time as part of the query struct.
However, this site has probably close to 1000 pages, and modifying all of them just isn't realistic. So I'd like to do this completely programmatically in the onRequestEnd function. In order to do that I need to somehow get a list of all queries that have executed on the page and that's where I'm stumped.
How can I get a list of the names of all queries that have executed on the current page? These queries appear in the template's variables scope, but there are a myriad of other variables in there too and I'm not sure how to easily loop through that and determine which is a query.
Any help would be appreciated.
Since that information is available via the debugging templates, you might take a look at those files for some pointers.
Another thing to consider is encapsulating your queries in a CFC or custom tag and having that deal with the logging (but I suspect that your queries are spread all over the site so that might be a lot of pages to modify - although that speaks to why encapsulating data access is a good idea: it's easier to maintain and enhance for exactly this sort of situation).
The relevant code from the debug templates (modernized a bit), is:
<cfset tempFactory = createObject("java", "coldfusion.server.ServiceFactory") />
<cfset tempCfdebugger = tempFactory.getDebuggingService() />
<cfset qEvents = tempCfdebugger.getDebugger().getData() />
<cfquery dbType="query" name="qdeb">
SELECT *, (endTime - startTime) AS executionTime
FROM qEvents WHERE type = 'SqlQuery'
</cfquery>
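Putting that together with the question, a rough sketch of an onRequestEnd method might look like the following. It relies on the same undocumented debugging service as the snippet above and assumes that debugging output is enabled for the request. The flag name application.logQueries and the log file name queryTimings are hypothetical. The whole block is wrapped in cftry so that a missing debugger does not break the page.

```cfml
<cffunction name="onRequestEnd" returntype="void" output="false">
    <cfargument name="targetPage" type="string" required="true" />
    <cfset var factory = "" />
    <cfset var qEvents = "" />
    <cfset var qdeb = "" />
    <!--- application.logQueries is a hypothetical on/off flag --->
    <cfif StructKeyExists(application, "logQueries") AND application.logQueries>
        <cftry>
            <cfset factory = createObject("java", "coldfusion.server.ServiceFactory") />
            <cfset qEvents = factory.getDebuggingService().getDebugger().getData() />
            <cfquery dbtype="query" name="qdeb">
            SELECT *, (endTime - startTime) AS executionTime
            FROM qEvents WHERE type = 'SqlQuery'
            </cfquery>
            <!--- One log line per query event recorded for this request --->
            <cfloop query="qdeb">
                <cflog file="queryTimings" type="information"
                    text="#arguments.targetPage#: query took #qdeb.executionTime# ms" />
            </cfloop>
            <cfcatch type="any">
                <!--- Debug data unavailable (for example, debugging is off); skip logging --->
            </cfcatch>
        </cftry>
    </cfif>
</cffunction>
```

Writing to a database table instead of a log file would mean replacing the cflog call with an insert query. Dumping qdeb first is a good way to see which other columns the debugger actually provides.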

How can I use query-of-query UNION on n-recordsets when var scoping is needed?

I would like to be able to do a query of queries to UNION an unknown number of recordsets. However, when doing a query of queries, dots or brackets are not allowed in recordset names.
For example this fails:
<cfquery name="allRecs" dbtype="query">
SELECT * FROM recordset[1]
UNION
SELECT * FROM recordset[2]
</cfquery>
Using dynamic variable names such as "recordset1" works, but this is in a function and needs to be var-scoped, so I can't build up the variable names dynamically without producing memory leaks in a persisted object.
Any other ideas?
After posting the question I came up with a couple of solutions, but there might be a better one out there:
I could write dynamically named variables to the arguments scope and then reference them without their scope in query
Create a function that accepts 2 recordsets as arguments and returns one combined recordset. This could be looped over to progressively add a recordset at a time. I'm sure this is very inefficient compared to doing all UNIONs in one query though.
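The first idea could be sketched roughly as follows. The function name unionRecordsets, the argument name recordsets, and the rs1, rs2, ... key names are all made up for illustration. The sketch assumes that every recordset has the same columns, that the array holds at least one query, and that the query of queries resolves unscoped names from the arguments scope, as suggested above.

```cfml
<cffunction name="unionRecordsets" returntype="query" output="false">
    <cfargument name="recordsets" type="array" required="true" />
    <cfset var i = 0 />
    <cfset var combined = "" />
    <!--- Copy each recordset to a dynamically named key in the arguments scope,
          which is discarded when the function returns --->
    <cfloop from="1" to="#ArrayLen(arguments.recordsets)#" index="i">
        <cfset arguments["rs#i#"] = arguments.recordsets[i] />
    </cfloop>
    <!--- Build the UNION on the fly, referencing the keys without a scope prefix --->
    <cfquery name="combined" dbtype="query">
        <cfloop from="1" to="#ArrayLen(arguments.recordsets)#" index="i">
            <cfif i GT 1>UNION</cfif>
            SELECT * FROM rs#i#
        </cfloop>
    </cfquery>
    <cfreturn combined />
</cffunction>
```

Note that UNION removes duplicate rows; UNION ALL would keep them.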
Difficult task. I could imagine a solution with a nested loop based on GetColumnNames(), using QueryAddRow() and QuerySetCell(). It won't be the most efficient one, but it is not really slow. Depends on the size of the task, of course.
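A minimal sketch of that row-by-row approach, using the query's columnList in place of GetColumnNames(). The function name appendQuery and its argument names are hypothetical, and the sketch assumes that the target query already has every column of the source.

```cfml
<cffunction name="appendQuery" returntype="query" output="false">
    <cfargument name="target" type="query" required="true" />
    <cfargument name="source" type="query" required="true" />
    <cfset var row = 0 />
    <cfset var col = "" />
    <cfloop from="1" to="#arguments.source.recordCount#" index="row">
        <!--- Add an empty row, then fill it column by column --->
        <cfset QueryAddRow(arguments.target) />
        <cfloop list="#arguments.source.columnList#" index="col">
            <!--- With no row number, QuerySetCell writes to the last row --->
            <cfset QuerySetCell(arguments.target, col, arguments.source[col][row]) />
        </cfloop>
    </cfloop>
    <cfreturn arguments.target />
</cffunction>
```

Because queries are passed by reference, the target query itself is modified; pass a Duplicate() of it if the original must stay untouched.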
Your "create a function that combines two recordsets" could be made much more efficient when you create it to accept, say, ten arguments. Modify the SQL on the fly:
<cfset var local = StructNew()>
<cfquery name="local.union" dbtype="query">
SELECT * FROM argument1
<cfloop from="2" to="#ArrayLen(arguments)#" index="local.i">
<cfif IsQuery(arguments[local.i])>
UNION
SELECT * FROM argument#local.i#
</cfif>
</cfloop>
</cfquery>
<cfreturn local.union>
After a quick bit of poking around, I found this:
queryConcat at CFLib.org. It uses queryaddrow/querysetcell to concatenate two queries.
I added a quick function (with no error checking, or data validation, so I wouldn't use it as-is):
<cffunction name="concatenate">
<cfset var result = arguments[1]>
<!--- var-scope the loop index so it does not leak into the variables scope --->
<cfset var i = 0>
<cfloop from="2" to="#arraylen(arguments)#" index="i">
<cfset result=queryconcat(result, arguments[i])>
</cfloop>
<cfreturn result>
</cffunction>
As a test, I threw this together:
Which does, in fact, give you fred/sammy/fred.
It's probably not the most efficient implementation, but you can always alter the insert/union code to make it faster if you wanted. Mostly, I was aiming to write as little code as possible by myself. :-)
All of the solutions added here should work for you, but I would also mention that, depending on how much data you are working with and the database you are using, you might be better off trying to find a way to do this on the database side. With very large record sets, it might be beneficial to write the records to a temporary table and select them out again. Either way, if you can rewrite the queries to let the database handle this in the first place, you will be better off.