I’ve created a file index of all of my ColdFusion files so I can quickly search the files and find what I’m looking for. So far, it’s working great except it doesn’t seem to be searching inside any ColdFusion tags.
For example…
<p>If I searched for this text, It would return a result</p>
<cfset variables.foo = "however, If I search for this text it wouldn’t return any results." />
Does anyone know if there’s a way to search inside of a ColdFusion tag like that?
This is my index..
<cfindex
collection = "fileIndex"
action="refresh"
type="path"
key="d:\my-websites-location\"
urlpath="http://mywebsite/"
extensions=".cfm, .cfml, .cfc"
recurse="Yes">
This is my search…
<cfsearch
name = "testSearch"
collection = "fileIndex"
type="internet"
criteria = "variables.foo"
/>
Any ideas?
Thanks,
Paul : )
It looks like the type="internet" may be your issue. Try removing the "type" attribute and see what you get.
Use a query that does get the record and look at the "summary" field of your result. I suspect the markup is being stripped.
On ColdFusion 9, Solr doesn't index the markup, whereas Verity does. As a workaround, you could use cfdirectory and cffile to read each file one by one and feed its contents into the collection yourself. This preserves the markup and makes it searchable.
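To illustrate the idea behind that workaround (not ColdFusion, and not how Verity or Solr work internally), here is a rough Python sketch: walk the directory, read every file as raw text without parsing it, and search that raw text. The function names and the demo file are made up for the example.

```python
import os
import tempfile

def build_raw_index(root, extensions=(".cfm", ".cfml", ".cfc")):
    """Map each matching file path to its raw, unparsed text."""
    index = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(extensions):
                path = os.path.join(folder, name)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    index[path] = handle.read()
    return index

def search(index, criteria):
    """Return the paths whose raw text contains the criteria (case-insensitive)."""
    needle = criteria.lower()
    return sorted(path for path, text in index.items() if needle in text.lower())

# Demo with a throwaway directory holding one hypothetical template
demo_root = tempfile.mkdtemp()
with open(os.path.join(demo_root, "page.cfm"), "w", encoding="utf-8") as handle:
    handle.write('<p>visible text</p>\n<cfset variables.foo = "hidden text" />\n')

demo_index = build_raw_index(demo_root)
demo_hits = search(demo_index, "variables.foo")
print(demo_hits)
```

Because the file content is never run through a markup-stripping step, text inside tags stays searchable.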
Or, if variables.foo is a variable whose value you want to search for, enclose it in pound signs so ColdFusion evaluates it:
<cfsearch
name = "testSearch"
collection = "fileIndex"
type="internet"
criteria = "#variables.foo#"
/>
I'm trying to read some values from the XML file which I created, but it gives me the following error:
coldfusion.runtime.UndefinedElementException: Element MYXML.UPLOAD is undefined in XMLDOC.
Here is my code
<cffile action="read" file="#expandPath("./config.xml")#" variable="configuration" />
<cfset xmldoc = XmlParse(configuration) />
<div class="row"><cfoutput>#xmldoc.myxml.upload-file.size#</cfoutput></div>
Here is my config.xml
<myxml>
<upload-file>
<size>15</size>
<accepted-format>pdf</accepted-format>
</upload-file>
</myxml>
Can someone help me to figure out what is the error?
When I am printing the entire variable as <div class="row"><cfoutput>#xmldoc#</cfoutput></div> it is showing the values as
15 pdf
The problem is the hyphen - contained in the <upload-file> name within your XML. If you are in control of the XML contents the easiest fix will be to not use hyphens in your field names. If you cannot control the XML contents then you will need to do more to get around this issue.
Ben Nadel has a pretty good blog article on the topic: Accessing XML Nodes Having Names That Contain Dashes In ColdFusion
From that article:
To get ColdFusion to see the dash as part of the node name, we have to "escape" it, for lack of a better term. To do so, we either have to use array notation and define the node name as a quoted string; or, we have to use xmlSearch() where we can deal directly with the underlying document object model.
He goes on to give examples. As he states in that article, you can quote the node name and use array notation to access the data, like this:
<div class="row">
<cfoutput>#xmldoc.myxml["upload-file"].size#</cfoutput>
</div>
Or you can use the xmlSearch() function to find the node for you. Note that this returns an array of matching nodes, like this:
<cfset xmlarray = xmlSearch(xmldoc,"/myxml/upload-file")>
<div class="row">
<cfoutput>#xmlarray[1].size#</cfoutput>
</div>
Both of these examples will output 15.
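The same principle shows up outside ColdFusion: as long as the node name is passed as a string rather than written as an identifier, the hyphen can't be mistaken for a minus sign. A small Python sketch using the config.xml from the question (the variable names are mine):

```python
import xml.etree.ElementTree as ET

CONFIG = """<myxml>
  <upload-file>
    <size>15</size>
    <accepted-format>pdf</accepted-format>
  </upload-file>
</myxml>"""

root = ET.fromstring(CONFIG)  # root is the <myxml> element

# Node names are passed as strings, so the hyphen is never parsed as subtraction.
size = root.find("upload-file/size").text
accepted = root.find("upload-file/accepted-format").text
print(size, accepted)
```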
I created a gist for you to see these examples as well.
How do I get droplink and treelist values in Sitecore search?
Below are my code and config file, but when I search based on a droplink or treelist value, the item does not come up in the search results.
var fquery = new FullTextQuery(search);
SearchHits searchHits = sc.Search(fquery, int.MaxValue);
return searchHits.FetchResults(0, int.MaxValue).Select(r => r.GetObject()).ToList();
Config file entry:
I am not sure if I have to parse them or do something else. Looking forward to your help.
You don't say which version of Sitecore you're using, but speaking as someone who works with v6.6:
ID-based fields like TreeList store GUIDs in the Sitecore database. At index time, Sitecore parses these into ShortID format and forces them to lower case. So the Lucene index entry actually contains a lowercase GUID with no curly braces or hyphens.
Chances are, your text-based query is not going to contain text that will match this.
I tend to use a Term based BooleanQuery object to match ID-based fields. Something like:
BooleanQuery query = new BooleanQuery();
query.Add(new TermQuery(new Term("myfieldname", ShortID.Encode(myGuidToMatch).ToLowerInvariant())), BooleanClause.Occur.MUST);
Note that the field name you want to query should be in lower case, as Sitecore / Lucene generally works in lower case.
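To make the format concrete, here is a small Python sketch of the transformation described above: braces and hyphens stripped, everything lower case. It only imitates the resulting string; it is not Sitecore's ShortID code, and the GUID is just an example value.

```python
import uuid

def to_index_id(raw_id):
    """Turn '{XXXXXXXX-XXXX-...}' into 32 lowercase hex characters."""
    # uuid.UUID accepts the braces and hyphens; .hex drops them and lowercases.
    return uuid.UUID(raw_id).hex

example = to_index_id("{110D559F-DEA5-42EA-9C1C-8A5DF7E70EF9}")
print(example)  # 110d559fdea542ea9c1c8a5df7e70ef9
```

A free-text query for an item's name will never match a string like that, which is why the term has to be built from the ID.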
You may find the code and UI example in this blog post helpful to see an example of building queries against ID-based fields:
http://jermdavis.wordpress.com/2014/06/09/faceted-search-in-sitecore-6-6/
If you want to be able to match the values contained in these fields from a free text type of search box, then you will have to pre-process the values from these ID-based fields before they are indexed.
Sitecore and Lucene allow for the idea of "computed fields" in your index - basically you can configure the indexing process to run bits of your own code in order to process data at index time, and to create new Lucene index fields from the results of your code.
This blog post of mine gives an example of a computed field:
http://jermdavis.wordpress.com/2014/05/05/using-dms-profile-cards-as-search-metadata/
That example is not doing what you want - but it does talk about how you might configure one of these custom fields.
You'd probably want your custom field code to:
Get the raw value of the ID-based field
Load the item that this ID points to
Process that item to turn it into the pattern of text you want to be indexed
Return this text, to be saved into the computed field in Lucene
With that done, you should find that your index will contain the text associated with your ID field(s). And hence you should be able to match it with a text-based query.
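Language aside, the four steps above boil down to something like this Python sketch. The dictionary stands in for the item database, and the IDs, the "display_name" key and the function name are all made up for illustration.

```python
def resolve_display_names(raw_value, items_by_id):
    """Turn a pipe-delimited list of IDs into the text to be indexed."""
    if not raw_value or not raw_value.strip():
        return None  # nothing stored in the field
    names = []
    for item_id in raw_value.split("|"):      # step 1: raw value -> IDs
        item = items_by_id.get(item_id)       # step 2: load the referenced item
        if item is not None:
            names.append(item["display_name"])  # step 3: item -> text
    return " ".join(names)                    # step 4: text for the computed field

# Hypothetical data standing in for tag items
items = {
    "{A1}": {"display_name": "Blue"},
    "{B2}": {"display_name": "Green"},
}
indexed_text = resolve_display_names("{A1}|{B2}|{MISSING}", items)
print(indexed_text)  # Blue Green
```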
-- EDITED TO ADD --
More detail on creating computed index items:
Sitecore 6.6 doesn't directly support computed fields in your Lucene indexes. To get them you can make use of the Advanced Database Crawler - which is part of the Sitecore SearchContrib project available on GitHub: https://github.com/sitecorian/SitecoreSearchContrib
There are assorted blog posts about getting started with this code.
[NB: In Sitecore 7.x I believe this behaviour has migrated into the core of Sitecore. However I think they changed the names of stuff. Details of that are available via google - things like Upgrading sitecore 6.6 index configuration to sitecore 7 (using ComputedFields) for example]
The code for a dynamic field to turn something ID-based into text might look like:
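using System.Text;
using Sitecore.Data.Fields;
using Sitecore.Data.Items;

// BaseDynamicField comes from the SitecoreSearchContrib crawler assembly.
public class IndexIDField : BaseDynamicField
{
    public override string ResolveValue(Sitecore.Data.Items.Item item)
    {
        Field fld = item.Fields["FieldYouAreInterestedIn"];
        if (fld != null && !string.IsNullOrWhiteSpace(fld.Value))
        {
            // Multi-value fields such as TreeList store their IDs separated by "|"
            string[] ids = fld.Value.Split('|');
            StringBuilder text = new StringBuilder();
            foreach (string id in ids)
            {
                // Use the item's own database, not Sitecore.Context.Database
                Item relatedItem = item.Database.GetItem(id);
                if (relatedItem != null)
                {
                    text.Append(relatedItem.DisplayName);
                    text.Append(" ");
                }
            }
            return text.ToString();
        }
        return null;
    }
}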
This is extracting the appropriate field from the context item that's being passed in. If it exists and is not empty, it splits it by "|" to get a list of all the IDs stored in this field. Then for each one it tries to load it. Note use of the appropriate database specified by the input item - Sitecore.Context.Database will point to Core at this point, and you won't find your items there. Finally, if we get back a valid item from the ID, we append its display name to our text to be indexed. You could use other fields than Display Name - depending on what makes sense in your solution.
With that code added to your solution you need to ensure it's called at index build time. The default config for the Advanced Database Crawler includes a config element for dynamic fields. (And again, SC7.x will have something similar but I don't know the names off the top of my head) You need to add your type to the configuration for dynamic fields. Snipping out all the extraneous bits from the default config:
<configuration xmlns:x="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <!-- snip -->
    <search>
      <!-- snip -->
      <crawlers>
        <demo type="scSearchContrib.Crawler.Crawlers.AdvancedDatabaseCrawler,scSearchContrib.Crawler">
          <!-- snip -->
          <dynamicFields hint="raw:AddDynamicFields">
            <!-- snip -->
            <dynamicField type="YourNamespace.IndexIDField,YourDLL" name="_myfieldname" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
          </dynamicFields>
          <!-- snip -->
        </demo>
      </crawlers>
    </search>
  </sitecore>
</configuration>
That sets up a new field called "_myfieldname" with sensible options for indexing text.
Rebuild your search index, and you should find your free text queries will match the appropriate items. Testing this basic setup on an instance of SC6.6, I get hits with some test data I happened to have lying around. If I search my computed column for "blue" I get only rows which were tagged with a metadata item that had "blue" in its name.
I was wondering if anyone could point me in the right direction on this one.
The system I'm building allows users to add comments to records. If the user wants to reference another record in their comment I want them to be able to use the # symbol followed by the 5 digit record id.
For example when a user submits a comment like "Updated details for this record and record #25645" I need it to be outputted with a href around the "#25645" which will link to the record in question.
I'm trying to use REReplaceNoCase with limited success.
<cfset LinkableComments = REReplaceNoCase(Comments, "#[0-9][0-9][0-9][0-9][0-9]", "Test", "all") />
I can't figure out how to get the record's ID back into the URLs.
Any thoughts?
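Leigh is correct, you can use a back reference like so:
<cfset LinkableComments = REReplaceNoCase(Comments, "##([0-9][0-9][0-9][0-9][0-9])", "##\1", "all") />
Basically you place the part of the regex you wish to capture (the "group") inside ( and ), then refer to it in the replace-with argument as \1 (because this is the first group; \2 for the second, and so on). Wrap whatever link markup you need around \1 in that replacement string. Note that a literal # inside a ColdFusion string has to be doubled (##), otherwise ColdFusion treats it as the start of an expression.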
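For anyone who wants to see the back reference end to end, here is the same idea as a Python sketch. The /records/ URL is a placeholder I made up, since the question doesn't say what the real link looks like; Python also doesn't need the # doubled.

```python
import re

# One capture group around the five digits; \1 refers back to it.
PATTERN = re.compile(r"#([0-9]{5})")
REPLACEMENT = r'<a href="/records/\1">#\1</a>'  # hypothetical URL scheme

def link_record_ids(comment):
    """Wrap every #NNNNN reference in a link to that record."""
    return PATTERN.sub(REPLACEMENT, comment)

linked = link_record_ids("Updated details for this record and record #25645")
print(linked)
```

The captured digits are used twice in the replacement: once inside the href and once as the visible link text.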
Try it yourself here ( <-- Edit: Gist was updated to show the #([0-9]{5}) syntax suggested by @PeterBoughton)
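According to the documentation on REReplace, you ought to be able to use a {5} quantifier instead of repeating the character class (the literal # is doubled so that ColdFusion doesn't treat it as the start of an expression):
REReplaceNoCase(Comments, "##([0-9]{5})", "Test", "all")
I'll test when my CF server is back up again.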
I'm just asking to check whether I'm doing things the right way. I want to add advanced search capabilities to my Django app, and I've started testing Haystack with Solr as the backend.
As I really do need partial matches, I've modified my schema.xml so that the text field defined by Haystack is of type ngram, like this:
<field name="text" type="ngram" indexed="true" stored="true" multiValued="false" />
Partial matches are now working with the default view included in Haystack: for a model instance called John, a search for "Joh" finds it, as does "ohn" or any other three-letter combination.
Is this the right way? And why isn't the text field of ngram type by default? Is it because of performance issues?
Thanks a lot!
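See this and this. I'm not sure how it does in terms of performance, but you can get the partial match behavior, also using ngrams, without having to touch your Solr schema. Just define your index's text field as an EdgeNgramField. Note that EdgeNgramField only matches from the start of a word ("Joh" but not "ohn"); for matches anywhere inside a word, Haystack also provides NgramField.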
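If the difference between the two isn't obvious, this Python sketch shows which fragments each approach would produce for one word. It only imitates the tokenizing idea; it is not Haystack or Solr code, and the minimum and maximum sizes are assumptions rather than the real defaults.

```python
def ngrams(word, min_size=3, max_size=15):
    """Every substring of the word within the size limits."""
    word = word.lower()
    grams = set()
    for size in range(min_size, min(max_size, len(word)) + 1):
        for start in range(len(word) - size + 1):
            grams.add(word[start:start + size])
    return grams

def edge_ngrams(word, min_size=2, max_size=15):
    """Only the substrings anchored at the start of the word."""
    word = word.lower()
    return {word[:size] for size in range(min_size, min(max_size, len(word)) + 1)}

print(sorted(ngrams("John")))       # ['joh', 'john', 'ohn']
print(sorted(edge_ngrams("John")))  # ['jo', 'joh', 'john']
```

Plain ngrams index many more fragments per word, so the index grows faster, which is one plausible reason it isn't the default.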
AS 3.0 / Flash
I am consuming XML which I have no control over.
The XML has HTML in it, which I am styling and displaying in an HTML text field.
I want to remove all the HTML except the links.
Strip all HTML tags except links
is not working for me.
Does anyone have any tips? A regex?
The following removes tables:
var reTable:RegExp = /<table\s+[^>]*>.*?<\/table>/s;
But now I realize I need to keep the content that is in the tables, and I also need the links.
Thanks!
cp
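You probably shouldn't use regex to parse HTML, but if you don't care, something simple like this will do (\s* rather than \s+, so that a plain <table> and </table> also match, plus the g and s flags so it replaces every table and spans line breaks):
find /<table\b[^>]*>.*?<\/table\s*>/gs
replace ""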
ActionScript has a pretty neat tool for handling XML: E4X. Rather than relying on RegEx, which I find often messes things up with XML, just modify the actual XML tree, and from within AS:
var xml : XML = <page>
    <p>Other elements</p>
    <table><tr><td>1</td></tr></table>
    <p>won't</p>
    <div>
        <table><tr><td>2</td></tr></table>
    </div>
    <p>be</p>
    <table><tr><td>3</td></tr></table>
    <p>removed</p>
    <table><tr><td>4</td></tr></table>
</page>;
clearTables (xml);
trace (xml.toXMLString()); // will output everything but the tables

// Removes the <table> children of this node, then recurses into the remaining child elements
function clearTables (xml : XML ) : void {
    xml.replace( "table", "");
    for each (var child:XML in xml.elements("*")) clearTables(child);
}
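For comparison, the same tree-walking idea in Python with the standard library's ElementTree, as a rough sketch (the sample markup is shortened from the example above, and the function name is mine):

```python
import xml.etree.ElementTree as ET

def remove_tables(element):
    """Drop <table> children of this element, then recurse into the rest."""
    for child in list(element):  # copy the list, since we remove while iterating
        if child.tag == "table":
            # Note: this also drops any text that directly follows the table.
            element.remove(child)
        else:
            remove_tables(child)

PAGE = (
    "<page><p>Other elements</p>"
    "<table><tr><td>1</td></tr></table>"
    "<div><table><tr><td>2</td></tr></table></div>"
    "<p>kept</p></page>"
)

page = ET.fromstring(PAGE)
remove_tables(page)
print(ET.tostring(page, encoding="unicode"))
```

Like the E4X version, this removes the tables together with their content, so keeping the cell text or the links inside them would need an extra step.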