Sitecore computed index fields - how to store untokenized

I'm on Sitecore 8.1 with the default Lucene provider. I'm using a custom index with a computed field to store the actual values of a multilist field rather than the GUIDs. This works, and I can see in the Luke tool that the field is indexed as text.
Some of the values contain spaces, but I want each value to be indexed as a whole. The problem is that the values are being tokenized, so for example 'Little Hampton' is indexed as 'Little' and 'Hampton'.
How do I get computed fields to be stored untokenized? See the raw:AddComputedIndexField section:
<indexConfigurations>
<myCustomIndexConfiguration type="Sitecore.ContentSearch.LuceneProvider.LuceneIndexConfiguration, Sitecore.ContentSearch.LuceneProvider">
<indexAllFields>true</indexAllFields>
<initializeOnAdd>true</initializeOnAdd>
<analyzer ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/analyzer" />
<fieldMap type="Sitecore.ContentSearch.FieldMap, Sitecore.ContentSearch">
<fieldNames hint="raw:AddFieldByFieldName">
<!-- you must have _uniqueid or you wont be able to update the document later -->
<field fieldName="_uniqueid" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
<analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
</field>
<field fieldName="title" storageType="YES" indexType="UNTOKENIZED" vectorType="YES" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
<field fieldName="summary" storageType="NO" indexType="TOKENIZED" vectorType="YES" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
</fieldNames>
</fieldMap>
<fields hint="raw:AddComputedIndexField">
<!-- resolves selected guids to text values -->
<field storageType="NO" indexType="UNTOKENIZED" fieldName="my multilist field">My.CoolStuff.Class, My.CoolStuff</field>
</fields>
I've tried adding storageType="NO" indexType="UNTOKENIZED" to the field, but without effect: it remains tokenized and stored.

Try adding your computed field to the regular <fieldNames hint="raw:AddFieldByFieldName"> section, in addition to the computed field definition, and specify an analyzer.
For example:
<fieldNames hint="raw:AddFieldByFieldName">
<field fieldName="my multilist field" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
<Analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
</field>
</fieldNames>
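
To see why a keyword-style analyzer helps here, the following is a plain Python illustration of the idea, not Lucene or Sitecore code. The two function names are hypothetical stand-ins: one mimics an analyzer that splits on word boundaries, the other mimics what an analyzer named LowerCaseKeywordAnalyzer would presumably do (lowercase the value and keep it as one token).

```python
import re


def tokenizing_analyzer(value):
    """Lowercase the value and split it into separate word tokens."""
    return re.findall(r"\w+", value.lower())


def lowercase_keyword_analyzer(value):
    """Lowercase the value and emit it as one single token."""
    return [value.lower()]


value = "Little Hampton"
print(tokenizing_analyzer(value))         # ['little', 'hampton']
print(lowercase_keyword_analyzer(value))  # ['little hampton']
```

With the second behavior, a search has to match the whole lowercased value, which is what the question asks for.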

One workaround that should work:
Create a computed field that replaces the spaces in the field value with "_". When you search, replace any spaces in your search keyword with "_" as well, so the term becomes: Little_Hampton
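
A minimal sketch of that workaround in Python, just to show that the same normalization has to be applied on both the indexing side and the query side. The helper name to_index_token is hypothetical, and whether the underscore survives analysis depends on the analyzer configured for the field, which this thread does not confirm.

```python
def to_index_token(value):
    """Collapse runs of whitespace into single underscores."""
    return "_".join(value.split())


# Index time: the computed field would store the normalized value.
indexed_value = to_index_token("Little Hampton")

# Query time: the search keyword goes through the same function.
query_term = to_index_token("Little Hampton")

print(indexed_value)                # Little_Hampton
print(indexed_value == query_term)  # True
```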

Related

How to define a Decision table in DMN

How do I define a Decision table in the DMN designer for the following standard decision?
If claim amount is > 10000 then hradmin approval = Y, otherwise hradmin approval = N

In DMN,
a Decision table in DMN designer such that [...] If claim amount is >10000 then hradmin approval =Y otherwise hradmin approval=N
can be modeled as the following DRG for 1 InputData named claim amount and 1 Decision for the table named hradmin approval:
The hradmin approval decision table can be defined as follows:
The screenshot also shows sample data, matching your original requirements.
You can download the .dmn example here: https://kiegroup.github.io/kogito-online/?file=https://gist.githubusercontent.com/tarilabs/f9655e2f8a2c4253e66ce661e5c79879/raw/so69764028.dmn#/editor/dmn
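
For readers who cannot open the model, the logic of the two-row table can be sketched in plain Python. This is only an illustration of the rule semantics described in the question, not output from any DMN engine; the function name and the rule list are assumptions made for the example.

```python
def hradmin_approval(claim_amount):
    """Evaluate the two rules in order and return the first matching output."""
    rules = [
        (lambda amount: amount > 10000, "Y"),   # rule 1: claim amount > 10000
        (lambda amount: amount <= 10000, "N"),  # rule 2: everything else
    ]
    for condition, output in rules:
        if condition(claim_amount):
            return output
    return None  # no rule matched (cannot happen for numeric input here)


print(hradmin_approval(15000))  # Y
print(hradmin_approval(10000))  # N
```

The two conditions are mutually exclusive and cover every number, so exactly one rule fires for any amount.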

You can save the xml to a text file, upload it, try it out and modify it here: https://consulting.camunda.com/dmn-simulator/
<?xml version="1.0" encoding="UTF-8"?>
<definitions xmlns="https://www.omg.org/spec/DMN/20191111/MODEL/" xmlns:dmndi="https://www.omg.org/spec/DMN/20191111/DMNDI/" xmlns:dc="http://www.omg.org/spec/DMN/20180521/DC/" xmlns:di="http://www.omg.org/spec/DMN/20180521/DI/" xmlns:camunda="http://camunda.org/schema/1.0/dmn" id="dinnerDecisions" name="HR Approval Decision" namespace="http://camunda.org/schema/1.0/dmn" exporter="Camunda Modeler" exporterVersion="4.0.0">
<decision id="beverages" name="HR Approval">
<informationRequirement id="InformationRequirement_1xvojck">
<requiredInput href="#InputData_0pgvdj9" />
</informationRequirement>
<decisionTable id="DecisionTable_07q05jb">
<input id="InputClause_0bo3uen" label="Amount" camunda:inputVariable="">
<inputExpression id="LiteralExpression_0d6l79o" typeRef="integer">
<text>amount</text>
</inputExpression>
</input>
<output id="OuputClause_99999" label="HR Approval" name="hrApproval" typeRef="boolean" />
<rule id="row-506282952-7">
<description></description>
<inputEntry id="UnaryTests_0jb8hau">
<text>&gt;10000</text>
</inputEntry>
<outputEntry id="LiteralExpression_1kr45vj">
<text>true</text>
</outputEntry>
</rule>
<rule id="DecisionRule_05oqdbw">
<description></description>
<inputEntry id="UnaryTests_1vcdz6c">
<text>&lt;=10000</text>
</inputEntry>
<outputEntry id="LiteralExpression_0g5cscd">
<text>false</text>
</outputEntry>
</rule>
</decisionTable>
</decision>
<inputData id="InputData_0pgvdj9" name="Amount" />
<dmndi:DMNDI>
<dmndi:DMNDiagram id="DMNDiagram_0i21c0s">
<dmndi:DMNShape id="DMNShape_0a1lk6d" dmnElementRef="beverages">
<dc:Bounds height="80" width="180" x="430" y="130" />
</dmndi:DMNShape>
<dmndi:DMNEdge id="DMNEdge_1czaglz" dmnElementRef="InformationRequirement_1xvojck">
<di:waypoint x="500" y="287" />
<di:waypoint x="520" y="230" />
<di:waypoint x="520" y="210" />
</dmndi:DMNEdge>
<dmndi:DMNShape id="DMNShape_0aea4xy" dmnElementRef="InputData_0pgvdj9">
<dc:Bounds height="45" width="125" x="437" y="287" />
</dmndi:DMNShape>
</dmndi:DMNDiagram>
</dmndi:DMNDI>
</definitions>

How do I get Lucene.NET to combine 2 Sitecore fields into 1 index field?

I am using Lucene.NET with Sitecore for searching, and I have created a custom Lucene index. Normally there is a one-to-one mapping between Sitecore fields and Lucene index fields. I would like to take 2 fields and combine them in the Lucene index. Below is my custom index definition. It has a field called Activity and a field called Board, followed by an example of what I am trying to do: combine Activity and Board into one field in the index. I am not sure whether this is possible and, if so, what the syntax is for defining a combined field like this. Any ideas?
<index id="reportsIndex" singleInstance="true" type="IOM.library.CustomIndexer, IOM">
<param desc="name">$(id)</param>
<template hint="list:AddTemplate">
<template>{79EBE484-BAD6-4173-B80A-29AC7D734565}</template>
</template>
<fields hint="raw:AddField">
<field target="Title">Title</field>
<field target="SortTitle" storage="keyword">Title</field>
<field target="ShortDescription">ShortDescription</field>
<field target="FullDescription">FullDescription</field>
<field target="Topic">Topic</field>
<field target="Type">Type</field>
<field target="ReleaseDate">ReleaseDate</field>
<field target="Series">Series</field>
<field target="Activity">Activity</field>
<field target="Board">Board</field>
<field target="MyCombinedField">??Activity, Board??</field>
</fields>
</index>
UPDATE: I tried what people suggested and mapped 2 different Sitecore fields to the same Lucene field. However, that doesn't seem to work. I tried the following:
<index id="reportsIndex" singleInstance="true" type="IOM.library.CustomIndexer, IOM">
<param desc="name">$(id)</param>
<template hint="list:AddTemplate">
<template>{79EBE484-BAD6-4173-B80A-29AC7D734565}</template>
</template>
<fields hint="raw:AddField">
<field target="Title">Title</field>
<field target="Activity">Activity</field>
<field target="Board">Board</field>
<field target="MyCombinedField">Activity</field>
<field target="MyCombinedField">Board</field>
</fields>
</index>
When I look in IndexViewer, this is what I see. If the content item has content in the Activity field, that content is written to "MyCombinedField" (since it is first). If the Activity field has no content, "MyCombinedField" is populated with the Board content. But the content of both fields never ends up in MyCombinedField. Am I doing something wrong?

You must be using the old data indexes. Are you running a version earlier than Sitecore 6.5? You might consider rewriting your code to use Sitecore.Search.
Anyway, you can index multiple Sitecore fields in the same Lucene field with something similar to this:
<index id="system" singleInstance="true" type="Sitecore.Data.Indexing.Index, Sitecore.Kernel">
<param desc="name">$(id)</param>
<fields hint="raw:AddField">
<field target="name">#name</field>
<field target="name">__created</field>
<field target="name">#tid</field>
In this case the item name, the created date field, and the template ID are all indexed in the same field.
So in short: just create multiple field elements with the same target attribute.
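
The intended result of mapping several source fields to one target can be sketched in Python as follows. This is not how Sitecore's indexer is implemented (the question's update reports that only one value ended up in the field); it only shows the combined document the asker is after. The function name, the space separator, and the sample item values are assumptions made for illustration.

```python
from collections import defaultdict


def build_document(item, field_map):
    """Group source fields by target name and join their non-empty values."""
    combined = defaultdict(list)
    for target, source in field_map:
        value = item.get(source, "")
        if value:
            combined[target].append(value)
    return {target: " ".join(values) for target, values in combined.items()}


# (target, source) pairs, mirroring the <field target="...">source</field> entries.
field_map = [
    ("Title", "Title"),
    ("MyCombinedField", "Activity"),
    ("MyCombinedField", "Board"),
]
item = {"Title": "Annual Report", "Activity": "Audit", "Board": "Finance"}

print(build_document(item, field_map))
# {'Title': 'Annual Report', 'MyCombinedField': 'Audit Finance'}
```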

Sitecore Lucene index

We've recently deployed to a client's environment and we're not seeing news items. These are found using a Lucene search based on a template ID.
I can only think that Lucene isn't finding them. I've rebuilt the search indexes and we're definitely searching for the right templates.
I'm thinking the news isn't being included in the items found by Lucene. I can't see anything in Sitecore.SharedSource.Search.config that would prevent results from being returned. The search index is working for other items (we use it for menus, for instance).
Any ideas? I should add that we have added our Sitecore site to an existing project, developed externally, and there may be library code or configuration whose behavior we're not currently aware of.
Here's the configuration for the index from Sitecore.SharedSource.Search.config:
<index id="advancedmaster" type="Sitecore.Search.Index, Sitecore.Kernel">
<param desc="name">$(id)</param>
<param desc="folder">advanced_master</param>
<Analyzer ref="search/analyzer" />
<locations hint="list:AddCrawler">
<master type="Sitecore.SharedSource.Search.Crawlers.AdvancedDatabaseCrawler,Sitecore.SharedSource.Search">
<Database>master</Database>
<Root>/sitecore/content</Root>
<IndexAllFields>true</IndexAllFields>
<include hint="list:ExcludeField">
<!-- __revision field -->
<fieldId>{8CDC337E-A112-42FB-BBB4-4143751E123F}</fieldId>
<!-- __context menu field -->
<fieldId>{D3AE7222-425D-4B77-95D8-EE33AC2B6730}</fieldId>
<!-- __security field -->
<fieldId>{DEC8D2D5-E3CF-48B6-A653-8E69E2716641}</fieldId>
<!-- __renderings field -->
<fieldId>{F1A1FE9E-A60C-4DDB-A3A0-BB5B29FE732E}</fieldId>
</include>
<fieldCrawlers hint="raw:AddFieldCrawlers">
<fieldCrawler type="Sitecore.SharedSource.Search.FieldCrawlers.LookupFieldCrawler,Sitecore.SharedSource.Search" fieldType="Droplink" />
<fieldCrawler type="Sitecore.SharedSource.Search.FieldCrawlers.DateFieldCrawler,Sitecore.SharedSource.Search" fieldType="Datetime" />
<fieldCrawler type="Sitecore.SharedSource.Search.FieldCrawlers.DateFieldCrawler,Sitecore.SharedSource.Search" fieldType="Date" />
<fieldCrawler type="Sitecore.SharedSource.Search.FieldCrawlers.NumberFieldCrawler,Sitecore.SharedSource.Search" fieldType="Number" />
</fieldCrawlers>
<!-- If a field type is not defined, defaults of storageType="NO", indexType="UN_TOKENIZED" vectorType="NO" boost="1f" are applied-->
<fieldTypes hint="raw:AddFieldTypes">
<!-- Text fields need to be tokenized -->
<fieldType name="single-line text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="multi-line text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="word document" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="html" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="rich text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="memo" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<!-- Multilist based fields need to be tokenized to support search of multiple values -->
<fieldType name="multilist" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="treelist" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="treelistex" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="checklist" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<!-- Legacy tree list field from ver. 5.3 -->
<fieldType name="tree list" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
</fieldTypes>
</master>
</locations>
</index>

The problem was that we still had workflow enabled on these news items (we want it off for testing), and the parent item was in a state that needed reviewing, so it didn't appear in the search results.
Thanks for your suggestions, all adding to the sea of knowledge!

Are you accidentally excluding a template instead of including a template?
<include hint="list:ExcludeTemplate">
<template>ID HERE</template>
</include>
or
<include hint="list:IncludeTemplate">
<template>ID HERE</template>
</include>
You could also be listing your templates incorrectly. Each one needs to have a different element name.
<include hint="list:IncludeTemplate">
<news>NEWS ID HERE</news>
<event>EVENT ID HERE</event>
</include>

Sitecore TreelistEx search with Lucene.NET

Is there a way to search the contents of a TreeListEx field in a custom index inside Sitecore with Lucene.NET? I have tried to use a WildcardQuery to check whether an item is part of the TreeListEx field, but it's not working. Below is a code sample of what I tried:
WildcardQuery taggingQuery = new WildcardQuery(new Term("country tag", ShortID.Encode("{4ED2F7EE-5C2A-418C-B2F6-236F94166BA1}").ToLowerInvariant()));
I am basically trying to do a "contains", and WildcardQuery is the only way I could figure out to do it.

I should've paid more attention when setting up the index. I forgot to add the field type settings for each field. The multilist fields were getting indexed with a different analyzer instead of the standard analyzer. I added this to the crawler section of my config and my query started working:
<fieldTypes hint="raw:AddFieldTypes">
<!-- Text fields need to be tokenized -->
<fieldType name="single-line text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="multi-line text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="word document" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="html" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="rich text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="memo" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="text" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<!-- Multilist based fields need to be tokenized to support search of multiple values -->
<fieldType name="multilist" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="treelist" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="treelistex" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<fieldType name="checklist" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
<!-- Legacy tree list field from ver. 5.3 -->
<fieldType name="tree list" storageType="NO" indexType="TOKENIZED" vectorType="NO" boost="1f" />
</fieldTypes>
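
A rough Python illustration of why tokenizing a multi-value field matters for this kind of lookup. It assumes the raw field value is a pipe-separated list of GUIDs and that each ID is indexed in lowercase short form (no braces or dashes), as the ShortID.Encode call in the question suggests. The helper names and the second GUID are made up for the example, and the exact terms the crawler writes are not confirmed by this thread.

```python
def short_id(guid):
    """Strip braces and dashes and lowercase, similar in spirit to a short ID."""
    return guid.strip("{}").replace("-", "").lower()


def tokenized_terms(raw_value):
    """One term per selected item: each ID can be matched on its own."""
    return [short_id(part) for part in raw_value.split("|") if part]


def untokenized_terms(raw_value):
    """The whole raw value as a single term: only an exact full match works."""
    return [raw_value.lower()]


raw = "{4ED2F7EE-5C2A-418C-B2F6-236F94166BA1}|{11111111-2222-3333-4444-555555555555}"
wanted = short_id("{4ED2F7EE-5C2A-418C-B2F6-236F94166BA1}")

print(wanted in tokenized_terms(raw))    # True
print(wanted in untokenized_terms(raw))  # False
```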

Indexing all documents in a doc folder into Solr with FileListEntityProcessor

http://wiki.apache.org/solr/ExtractingRequestHandler does not provide much information on how to configure this handler in a web application that has its own context and wants to use Solr's server features as embedded Solr.
Can you please provide some information on how to upload documents to Solr and search for content from those documents?
I have configured DIH in solrConf.xml as follows:
<requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">tika-data-config.xml</str>
</lst>
</requestHandler>
and tika-data-config.xml looks like this:
<dataConfig>
<dataSource type="BinFileDataSource" name="bin" />
<document>
<entity name="sd"
processor="FileListEntityProcessor"
newerThan="'NOW-30DAYS'"
filenName=".*\.(DOC)|(PDF)|(pdf)|(doc)|(docx)|(ppt)"
baseDir="G:/workspace/FacetedSearch/src/solr/docs"
recursive="true"
rootEntity="false"
>
<field column="fileAbsolutePath" name="path" />
<field column="fileSize" name="size" />
<field column="fileLastModified" name="lastmodified" />
<field column="fileAbsolutePath" name="text" />
<!-- <field column="fileName" name="text" /> -->
<field column="baseDir" name="text" />
<!-- <entity name="tika-test" processor="TikaEntityProcessor"
url="${sd.fileAbsolutePath}" format="text" dataSource="bin">
-->
<entity name="tika-test"
dataSource="bin"
processor="TikaEntityProcessor"
url="G:/workspace/FacetedSearch/src/solr/docs"
format="text" >
<field column="Author" name="author" meta="true"/>
<field column="Content-Type" name="title" meta="true"/>
<field column="title" name="title" meta="true"/>
<field column="text" name="text"/>
</entity>
</entity>
</document>
</dataConfig>
The directory G:/workspace/FacetedSearch/src/solr/docs contains many PDF and HTML files,
among them tutorial.pdf ... index.pdf.
With this configuration, when I build a SolrQuery object as follows:
CoreContainer.Initializer initializer = new CoreContainer.Initializer();
CoreContainer coreContainer = initializer.initialize();
EmbeddedSolrServer solrServer = new EmbeddedSolrServer(coreContainer, "");
SolrQuery solrQuery = new SolrQuery();
solrQuery.addField("literal.id");
solrQuery.setQuery("index.pdf");
QueryResponse queryResponse = null ;
try{
queryResponse = (QueryResponse) solrServer.query(solrQuery);
}catch(Exception e){
System.out.println("exception occured while processing the solrQuery "+
e.getMessage() +"stack trace " + e + solrQuery.toString());
}
out.println(queryResponse);
I do not get any result (here queryResponse is null).
I have the schema.xml distributed with Solr 3.5 and added some fields:
<field name="path" type="text_general" indexed="true" stored="true" />
<field name="lastmodified" type="date" indexed="true" stored="true" />
My questions: will the documents in "G:/workspace/FacetedSearch/src/solr/docs"
be indexed by Solr on Solr startup?
If they are indexed, how can I get the results?
Can anyone please tell me what I am doing wrong?
Please let me know if you need any more information from me to answer.
