I am new to Solr and Django. I am working on implementing search for a party hall venue search website. I have not worked on the website part, only on implementing Solr for the search. I have indexed the party hall venue information in Solr with the following fields:
<field name="id" type="int" indexed="true" stored="true" required="true" multiValued="false" />
<field name="title" type="text_general" indexed="true" stored="true" required="true" />
<field name="slug" type="text_general" indexed="true" stored="true" required="true" />
<field name="description" type="text_general" indexed="true" stored="true"/>
<field name="location" type="text_general" indexed="true" stored="true"/>
<field name="city" type="text_general" indexed="true" stored="true"/>
<field name="area" type="text_general" indexed="true" stored="true"/>
<field name="featured" type="boolean" indexed="true" stored="true" />
<field name="facilities" type="text_general" indexed="true" stored="true" multiValued="true" />
<field name="type_of_venue" type="text_general" indexed="true" stored="true" multiValued="true" />...
There are many other fields that are used only to display data on the results page; only these fields are used to query data.
On my website I have a search bar where the user can enter any search term, and I then search it against title, description, location, and facilities.
I have read a lot about how to break down the search terms entered by the user to identify which field each one belongs to, so that I can use a different template view to show the matches, but I couldn't find any technique that works with Solr.
Can anyone suggest pre-search text processing techniques that would make it simpler to generate the Solr query?
Thanks in advance.
If you index your data carefully (see the examples below), there is an easy (but not the best!) way to do this.
Let's say these are your party hall documents:
1) party hall - "abc party hall", location - "san jose"
2) party hall - "xyz party hall", location - "san francisco"
3) party hall - "pqr party hall", location - "paris"
4) party hall - "best party hall", location - "san jose"
Let's say your user types "best party hall in san jose" in the search bar. Ideally you should return #4 and then #1, right?
You can certainly pre-process your query (complex NLP) to extract potential location data to use for the location field in your query.
For the moment, let's take a brute-force approach: use a boolean query and search the full query string, as-is, against all the important fields:
party_hall:(best party hall in san jose) AND location:(best party hall in san jose)
Note the parentheses rather than quotes. A quoted string is a phrase query, which would require the whole phrase to appear in each field and would match nothing here. With grouped terms (and the default OR operator inside each group), each field only needs to match some of the words.
If you've indexed your data properly (as in the example documents above), you will get the best results, as expected.
The party_hall field does not contain "san jose", so that clause matches on "best party hall"; similarly, the location clause matches documents with "san jose". So you should get your best-matched documents, #4 and #1, at the top. You can use OR instead of AND; you will get more matched documents, but the ranking should still be what you expect.
Try it with your use case and see if it helps!
P.S. This works if you use a tokenizing analyzer such as StandardAnalyzer (it will not work with KeywordAnalyzer).
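The brute-force approach above can be sketched in Python as a small query-string builder. This is only an illustration under some assumptions: the helper names are hypothetical, the field names `party_hall` and `location` are the ones from the example documents, and the terms are grouped with parentheses (not quoted as a phrase) so that each field can match on a subset of the words.

```python
# Characters that have special meaning to the Lucene/Solr query parser.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(term):
    """Backslash-escape query-parser special characters in one term."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


def build_bruteforce_query(user_query, fields, operator="AND"):
    """Search the same raw user query against every field.

    Terms are lowercased so that words like AND/OR typed by the user
    are not interpreted as boolean operators.
    """
    terms = " ".join(escape_term(t) for t in user_query.lower().split())
    clauses = ["{0}:({1})".format(field, terms) for field in fields]
    return " {0} ".format(operator).join(clauses)


query = build_bruteforce_query(
    "best party hall in san jose", ["party_hall", "location"]
)
print(query)
```

The resulting string can be passed as the `q` parameter of a Solr request. Passing `operator="OR"` gives the looser variant mentioned above.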
I have a product index with keywords as multivalued field
from haystack import indexes

class ProductIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    keywords = indexes.MultiValueField(faceted=True)

    def prepare_keywords(self, product):
        return [p.name for p in product.tags.all()]
I need to find Products that have a keyword that is exactly "Lighting". I use this query:
SearchQuerySet().models(Product).filter(keywords__exact=u'Lighting')
But this also gives me Products where "lighting" is only part of a keyword. For example:
print SearchQuerySet().models(Product).filter(keywords__exact=u'Lighting')[1].keywords
[u'LED lighting', u'Optic Lighting']
What is the correct way of doing this?
Look at example/solr/collection1/conf/schema.xml, where the field is defined like this:
<field name="keywords" type="text_en" indexed="true" stored="true" multiValued="true" />
I think exact matching only works on string fields, not text fields. You need to edit this configuration and define the keywords field like this:
<field name="keywords" type="string" indexed="true" stored="true" multiValued="true" />
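To see why the field type matters, here is a deliberately simplified Python model of the two behaviours. It is not Solr's actual analysis chain (a real text_en field also applies stemming and other filters), and both function names are hypothetical. It only illustrates that a tokenized text field matches on individual words, while a string field indexes each value verbatim as a single token.

```python
def matches_text_field(values, term):
    """Rough model of a tokenized text field: each value is split into
    lowercased words, and a match on any single word counts."""
    term = term.lower()
    return any(term in value.lower().split() for value in values)


def matches_string_field(values, term):
    """Rough model of a string field: the whole value is one token,
    compared verbatim (case-sensitive)."""
    return term in values


keywords = ["LED lighting", "Optic Lighting"]

text_hit = matches_text_field(keywords, "Lighting")
string_hit = matches_string_field(keywords, "Lighting")
exact_hit = matches_string_field(["Lighting", "LED lighting"], "Lighting")

print(text_hit, string_hit, exact_hit)
```

In this model the text field reports a match for the product tagged "LED lighting", while the string field only matches a product that carries the keyword "Lighting" by itself.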
I am working on a small back-end application on Sitecore 8. As a feature, the application has to quickly search through thousands of items and find which ones are not publishable. As far as I know, I need to add the relevant field to the Lucene index. I did some research on Google and found that people access this property through the __Never publish field. For example, they use it in Sitecore PowerShell to switch this boolean property (I tried it and it works).
However, I am struggling to get this working in the Lucene index. I added something like this to my index definition on the master database:
<configuration ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration">
<fieldMap type="Sitecore.ContentSearch.FieldMap, Sitecore.ContentSearch">
<fieldNames hint="raw:AddFieldByFieldName">
<field fieldName="title" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
...
<field fieldName="__Never publish" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
</fieldNames>
</fieldMap>
</configuration>
Any other field I add to the index (even some built-in fields) gets indexed, and its content gets stored as well (like "title" in the example above), but I don't understand why the __Never publish field doesn't.
I looked into other configuration files and found it is excluded from being indexed inside Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.config and the definition is like this:
<exclude hint="list:ExcludeField">
....
<NeverPublish>{9135200A-5626-4DD8-AB9D-D665B8C11748}</NeverPublish>
....
</exclude>
Then I commented it out, but still no luck. I wonder whether I am referencing the field name correctly, or whether there is anything else I should know. Any suggestions?
This configuration works fine for me:
<field fieldName="__never publish" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.Boolean" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
It differs from yours in the type: type="System.Boolean"
P.S. And of course the exclusion from the index should be commented out, as you mentioned above.
This is very similar to another question concerning how to switch a type from 'varchar' to 'char'. However, I'm having trouble getting my XML config to work.
One of the fields that app/console doctrine:schema:update --dump-sql is saying differs from my database is this:
ALTER TABLE user CHANGE password password VARCHAR(32) NOT NULL;
I'm pretty certain that that's because my database definition for that field is correctly set to char(32), but Doctrine isn't expecting the field to be fixed.
To poke Doctrine into knowing it's fixed, I've added the following XML:
<field name="password" type="string" column="password" length="32" nullable="false">
<options>
<option name="fixed" value="true" />
</options>
</field>
Is this not correct?
Some bonus information, if it matters: I'm using MySQL.
Just stumbled upon your thread.
The syntax in your original post was incorrect. I wanted to leave my comment just in case someone else happens to find this thread. While the accepted answer works, the 'fixed' option in the Doctrine schema works too.
With the code snippet above, it should look like this:
<field name="password" type="string" column="password" length="32" nullable="false">
<options>
<option name="fixed">true</option>
</options>
</field>
The value attribute is not valid; the option's value goes in the element body instead.
Hope this helps whoever ventures on this thread!
I've stumbled upon a reasonable enough answer, but it's definitely a bit of a hack.
<field name="password" type="string" column="password" length="32" nullable="false" column-definition="CHAR(32) NOT NULL" />
length and nullable are actually ignored here, but I figured I'd leave them in to be verbose about what I want. The column-definition is what's actually forcing the fixed-length CHAR.
Unfortunately, this attribute doesn't seem to get taken into account when doing the schema diff, so it'll always say there's a difference on that field.
I'm indexing data from an XML file, with many fields like these declared in DataImportHandler's dataconfig.xml:
<field column="pos_A" xpath="/positions/pos_A/#pos" />
<field column="pos_B" xpath="/positions/pos_B/#pos" />
<field column="pos_C" xpath="/positions/pos_C/#pos" />
...
And one matching dynamicField declaration in schema.xml :
<dynamicField name="pos_*" type="sint" indexed="true" stored="true" />
I'm wondering if it's possible to use a transformer to dynamically generate the field names in dataconfig.xml, and have a single line, kinda like :
<field column="pos_{$1}" xpath="/positions/pos_(*)/#pos" />
(pardon my xpath and regex syntax :)
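As a stopgap that does not depend on any transformer support, the repeated declarations could be generated by a small script and pasted into dataconfig.xml. The sketch below is only that workaround, not a DataImportHandler feature. The function name is hypothetical, and the xpath pattern is copied unchanged from the declarations above.

```python
# Template copied from the hand-written declarations; only the suffix varies.
FIELD_TEMPLATE = '<field column="pos_{s}" xpath="/positions/pos_{s}/#pos" />'


def dih_field_lines(suffixes, template=FIELD_TEMPLATE):
    """Return one <field> declaration per suffix, in the given order."""
    return [template.format(s=suffix) for suffix in suffixes]


lines = dih_field_lines(["A", "B", "C"])
for line in lines:
    print(line)
```

The list of suffixes still has to be known up front, so this only removes the copy-and-paste, not the need to enumerate the fields.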
https://issues.apache.org/jira/browse/SOLR-3251 The latest release claims that you can dynamically add fields to the schema. I tried to find documentation for the public interface, but not much luck so far.
> SOLR-4658: In preparation for REST API requests that can modify the schema, a "managed schema" is introduced. Add '<schemaFactory class="ManagedSchemaFactory" mutable="true"/>' to solrconfig.xml in order to use it, and to enable schema modifications via REST API requests. (Steve Rowe, Robert Muir)
I'm fairly new to SharePoint 2010. I've had some experience with 2007, but only debugging and fixing small bugs.
Assume that I create a new solution for SP 2010 in VS2010 and add a feature that creates some list definitions, plus some list instances of those list definition templates. These are all declared through Schema.xml.
I deploy successfully and add a few items to my new lists.
Now I want to add a few extra columns (fields) to my lists. How do I deploy them?
I don't want to create them in code. I would like to have an up-to-date solution, so that for every new developer a simple deployment creates an up-and-running dev environment.
What is the correct way to do the deployment in this case?
If you have the schema.xml file defined for the list, then you really want to add your new columns using the Fields collection within the list definition. You also want to be sure your list is defined by a content type, allowing for reuse. So within your schema.xml file, it would look something like this:
<List xmlns:ows="Microsoft SharePoint" Title="Test List" FolderCreation="FALSE" Direction="$Resources:Direction;" Url="Lists/Test-List" BaseType="0" xmlns="http://schemas.microsoft.com/sharepoint/">
<MetaData>
<ContentTypes>
<ContentType ID="0x010068a2e063a1a74913a37ecdb61ab2c721" Name="Test" Group="Custom Content Types" Description="Test Description" Inherits="TRUE" Version="0">
<FieldRefs>
<FieldRef ID="{c2f80e7d-666e-4273-8b58-d5c8a13a9d6a}" Name="Col1" ShowInNewForm="TRUE" Required="TRUE" ShowInEditForm="TRUE"/>
<FieldRef ID="{a84d620a-d42d-455c-8ef8-7e9f1d443250}" Name="Col2" Required="TRUE" ShowInNewForm="TRUE" ShowInEditForm="TRUE"/>
<!-- Your new field refs here -->
</FieldRefs>
</ContentType>
</ContentTypes>
<Fields>
<Field ID="{c2f80e7d-666e-4273-8b58-d5c8a13a9d6a}" Type="Text" AllowDeletion="FALSE" Description="Key" AllowDuplicateValues="FALSE" EnforceUniqueValues="TRUE" Indexed="TRUE" Name="Col1" DisplayName="Col1" Group="Custom Columns" />
<Field ID="{a84d620a-d42d-455c-8ef8-7e9f1d443250}" Type="Text" AllowDeletion="FALSE" Name="Col2" DisplayName="Col2" Group="Custom Columns" />
<!-- Your new fields here -->
</Fields>
...
</MetaData>
</List>
Don't forget to change your View as well!