Leading Wildcard Multiword in Solr Not working - solrj

I am trying to run a multi-word search with a leading wildcard in Solr.
Ex: *oogle - works for me
*oogle maps - doesn't work for me
Can you suggest a valid query and a solution for this issue?
schema.xml
<field name="abc_name" type="text_app" indexed="true" stored="true"/>
<field name="abc_address" type="text_app" indexed="true" stored="true"/>
<field name="abc_notes" type="text_app" indexed="true" stored="true"/>
<field name="abc_allSearchFields" type="text_app" indexed="true" stored="false" multiValued="true"/>
<copyField source="abc_*" dest="abc_allSearchFields"/>
<fieldType name="text_app" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
In the query,
q=abc_allSearchFields:"google maps" ---- Fetches all results that match "google maps" together. Works for me.
q=abc_allSearchFields:*ogle ---> Fetches all records that have words ending with "ogle". Works for me.
q=abc_allSearchFields:"*gle maps" ---- This fetches records with words that end with "gle" and records with "maps" separately. But my requirement is to get only records that contain "*gle maps" together, like "google maps".
Can someone help me with a solution for the 3rd case?

You need to include a ReversedWildcardFilterFactory in the index analyzer chain of the corresponding field type to make that work.
But beware of this issue:
https://issues.apache.org/jira/browse/SOLR-3193
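As a rough sketch of that suggestion, the field type could look like the fragment below. The name text_app_rev is hypothetical, and the attribute values are borrowed from the text_general_rev type in Solr's example schema rather than tuned for this data, so treat them as assumptions. Existing documents would need to be reindexed after such a change.

```xml
<!-- Sketch only: "text_app_rev" is a made-up name; attribute values follow Solr's example text_general_rev type -->
<fieldType name="text_app_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Indexes each token as-is and reversed, so a leading wildcard can run as a prefix query on the reversed form -->
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The filter belongs only in the index analyzer; the query analyzer stays as it was.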

Related

Solr query not working for field type "text"

I am using Solr to store some tag information. To store the data I am using a Solr "text" field with the definition below:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
</analyzer>
</fieldType>
Below is the stored data:
{
"id": "568658sheib",
"created_at":"2019-02-21T12:45:49Z",
"tags":"tagone_tagsuserinfo_tagsnameuserinfo,tagtwo_tagsuserinfo_tagsnameuserinfo,tagthree_tagsuserinfo_tagsnameuserinfo,tagfour_tagsuserinfo_tagsnameuserinfo,tagfive_tagsuserinfo_tagsnameuserinfo,tagsix_tagsuserinfo_tagsnameuserinfo,tagsseven_tagsuserinfo_tagsnameuserinfo,tagseight_tagsuserinfo_tagsnameuserinfo,tagsnine_tagsuserinfo_tagsnameuserinfo,tagsten_tagsuserinfo_tagsnameuserinfo,tagsleven_tagsuserinfo_tagsnameuserinfo,tagstwelve_tagsuserinfo_tagsnameuserinfo,tagsthirteen_tagsuserinfo_tagsnameuserinfo,tagsfourteen_tagsuserinfo_tagsnameuserinfo"
}
I get a result when I query like this:
id:568658sheib AND tags:tagone_tagsuserinfo_tagsnameuserinfo
id:568658sheib AND tags:tagsfourteen_tagsuserinfo_tagsnameuserinfo
But if I pick any 2 tags at random, as below, no result comes back.
tags:tagseight_tagsuserinfo_tagsnameuserinfo AND tags:tagsfourteen_tagsuserinfo_tagsnameuserinfo
It's very confusing. I have tried all kinds of queries, but nothing helped. Kindly help with this.

solr index field with preserving white spaces and searching it gives no result

I am using Solr with Django. In my schema I have a field:
<field name="function" type="text" indexed="true" stored="true" multiValued="true" />
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<!-- <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> -->
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
I have indexed values like Wedding shoot, Wedding, Reception, Pre Wedding, Cocktail, Post Wedding, Engagement.
I want to search for functions where the value is "Wedding shoot". When I use (), it gives me values containing "Wedding shoot" as well as values containing just "Wedding":
http://localhost:8983/solr/realwedding/select?q=function%3A(wedding+shoot)&rows=100&fl=function&wt=json&indent=true
And if I use "", it returns nothing:
http://localhost:8983/solr/realwedding/select?q=function%3A%22wedding+shoot%22&rows=100&fl=function&wt=json&indent=true
What I want is results that match the full text "Wedding shoot".
Thanks in advance.
Two tools are your friends here:
The "Field Analysis" page on the admin console, where you can see what happens at index and query time
The debug parameter (or the [explain] transformer), which gives you a perspective on how Solr "sees" and "executes" your query and why a given document is in the search results.
Beyond that, it's quite hard to guess what's going on there. The answer should be in the schema and in the definition of the RequestHandler that serves that request.
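As a sketch of what the Field Analysis page would likely show for the fieldType in the question: the index analyzer uses KeywordTokenizerFactory, so "Wedding shoot" is indexed as the single token "wedding shoot", while the query analyzer uses StandardTokenizerFactory, so the quoted query becomes the two tokens "wedding" and "shoot". One possible way to line the two up, which is an assumption and not something confirmed in this thread, is to use the same tokenizer on both sides:

```xml
<!-- Sketch only: same "text" type as in the question, with the query analyzer aligned to the index analyzer -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- Keeps a quoted value such as "wedding shoot" as one token, like the indexed form -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a setup like this the value still has to be quoted in the query, because the query parser splits unquoted text on whitespace before the analyzer sees it.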

Solr query returns 0 results

I have a few documents indexed in Solr. When I query using q=*:*, I get all the documents but when I send some word to q, I get no results. Below is the snippet of schema.xml
<?xml version="1.0" ?>
<schema name="default" version="1.5">
<types>
<fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
<fieldtype name="binary" class="solr.BinaryField"/>
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<!-- <fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/> -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0"/>
<!-- A Trie based date field for faster date range queries and date faceting. -->
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>
<fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<fieldtype name="geohash" class="solr.GeoHashField"/>
<fieldType name="text" class="solr.TextField">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<!-- <analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> -->
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<!-- <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer> -->
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
<!-- <analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="lang/stopwords_en.txt"
/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/> -->
<!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
<filter class="solr.EnglishMinimalStemFilterFactory"/>
-->
<!-- <filter class="solr.PorterStemFilterFactory"/> -->
<!-- </analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="lang/stopwords_en.txt"
/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/> -->
<!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
<filter class="solr.EnglishMinimalStemFilterFactory"/>
-->
<!-- <filter class="solr.PorterStemFilterFactory"/>
</analyzer> -->
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
</analyzer>
</fieldType>
<fieldType name="ngram" class="solr.TextField" >
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="edge_ngram" class="solr.TextField" positionIncrementGap="1">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
</analyzer>
</fieldType>
</types>
<fields>
<!-- general -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="django_ct" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="django_id" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="_version_" type="long" indexed="true" stored ="true"/>
<dynamicField name="*_i" type="int" indexed="true" stored="true"/>
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
<dynamicField name="*_l" type="long" indexed="true" stored="true"/>
<dynamicField name="*_t" type="text_en" indexed="true" stored="true"/>
<dynamicField name="*_b" type="boolean" indexed="true" stored="true"/>
<dynamicField name="*_f" type="float" indexed="true" stored="true"/>
<dynamicField name="*_d" type="double" indexed="true" stored="true"/>
<dynamicField name="*_dt" type="date" indexed="true" stored="true"/>
<dynamicField name="*_p" type="location" indexed="true" stored="true"/>
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
<field name="content" type="text_en" indexed="true" stored="true" multiValued="false" />
<field name="title" type="text_en" indexed="true" stored="true" multiValued="false" />
<field name="text" type="text_en" indexed="true" stored="true" multiValued="false" />
<field name="image" type="text_en" indexed="true" stored="true" multiValued="false" />
<field name="short_desc" type="text_en" indexed="true" stored="true" multiValued="false" />
<field name="pub_date" type="text_en" indexed="true" stored="true" multiValued="false" />
</fields>
<!-- field to use to determine and enforce document uniqueness. -->
<uniqueKey>id</uniqueKey>
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>text</defaultSearchField>
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="OR"/>
</schema>
What could I possibly be doing wrong?!
EDIT
Here is a sample of the document indexed in Solr.
And here is the query I ran that gave me 0 results:
As you can clearly see the document has India mentioned. So this document should have been returned. Is there something wrong with the query generated?
Either you will have to run your query against a field name, like below:
q=content:india
or you will have to define the default fields to search, for queries that don't name a field, in the select handler in your solrconfig file, as below:
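<requestHandler name="/select" class="solr.SearchHandler">
<!-- default values for query parameters can be specified, these
will be overridden by parameters in the request
-->
<lst name="defaults">
<int name="rows">10</int>
<!-- qf is only read by the dismax/edismax parsers, so select one of them -->
<str name="defType">edismax</str>
<!-- field names as declared in the schema above -->
<str name="qf">content short_desc</str>
</lst>
</requestHandler>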
It would have been good if you had shared the field type definition: which tokenizer is used, which filters are used, and so on.
If you have used the keyword tokenizer, which treats the entire text field as a single token, try using the StandardTokenizerFactory or WhitespaceTokenizerFactory instead.
The WhitespaceTokenizerFactory splits the text stream on whitespace and returns sequences of non-whitespace characters as tokens. Note that any punctuation will be included in the tokens.
If your input stream is : "The success of Republic Day in India"
Output is : "The", "success", "of", "Republic", "Day", "in", "India"
Adding filters such as a stopword filter or a lowercase filter helps further.
As an example
<fieldType name="text" class="solr.TextField">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
Here the final output would be different.
If your input stream is : "The success of Republic Day in India"
Output is : "the", "success", "of", "republic", "day", "in", "india"
Now you can query by "India" as well as "india" and get a match, because the term was indexed as "india", and at query time the lowercase filter turns the search text into "india" even if it is "India".
On top of that, if you add the stopword filter factory, it will not index words like "of", "the", "in", and searching on those words is not meaningful (in my opinion; others may disagree).
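A minimal sketch of the same field type with a stopword filter added might look like this. It assumes a stopwords.txt file in the core's conf directory that lists words such as "of", "the" and "in"; the file name and its contents are assumptions, not something given in the question.

```xml
<!-- Sketch only: assumes conf/stopwords.txt exists and contains the words to drop -->
<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drops tokens listed in stopwords.txt before they reach the index -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Same filter at query time so stopwords in the search text are ignored too -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>
```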
Solr provides a web interface where you can analyse your field types, see how they index the stream, and work out what you need to change to get the right result.
I hope this helps...
For more information on all tokenizers and filters, please have a look at:
https://cwiki.apache.org/confluence/display/solr/Tokenizers#Tokenizers-WhiteSpaceTokenizer
https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions
In these cases I'd add the debugQuery=true parameter to my HTTP request. The displayed information includes how Solr sees the q parameter, so you should be able to see what's going wrong. Shooting in the dark, I guess the documents are not actually indexed or you're using the wrong query parser (e.g. *:* is not a valid query for DisMax).
Now that your post has been updated I see a strange thing (but I could be wrong, I'm reading this looong post on my mobile):
nothing fills the "text" field...
The document you're looking for has the "india" term in the "content" field, but the df (default field used in queries) is "text", so this is the correct behaviour: nothing matches "india" in "text" because "text" is empty. You could do one of the following:
change the default field from text to content
explicitly name the content field in your query (e.g. content:india)
declare a copyField directive with source=content and dest=text
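For the third option, a minimal sketch of the directive, using the field names from the schema in the question, would be:

```xml
<!-- Goes in schema.xml alongside the field definitions; copies the raw "content" value into "text" at index time -->
<copyField source="content" dest="text"/>
```

Documents have to be reindexed for the copy to take effect. If more than one source field is copied into "text", that field would also need multiValued="true".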

Sunspot Solr highlight: Always highlight from the beginning

I need Highlights to be from the beginning of the sentence.
For example:
Indexed Field:
I am drinking beer in Downtown Pub.
If the search query is drinking beer, the returned result is:
I am drinking beer in Downtown Pub.
What I want is:
drinking beer in Downtown Pub.
I am using following Configuration in schema.xml:
<fieldType name="text" class="solr.TextField" omitNorms="false">
<analyzer type="index">
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="35" side="front" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
I think I need some highlighter configuration, but I cannot figure out what to set or where to put it. I have gone through the official documentation here and here but cannot make it work.
I am using sunspot. Any help is appreciated. You may need to be more explicit as I am relatively new to this.

Solr Analyzer PatternReplaceCharFilterFactory is not taken in consideration. (maybe cause of ngram or multivalued)

Here is my problem.
I have to normalize address data to strip out "th" or "st".
String example: 35 West 15th Street
I cannot just use a synonym because the th/st is part of the "word" 15th, so I need to use
solr.PatternReplaceCharFilterFactory
Here are my schema entries:
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([0-9]{1,})(st |th |ST |TH )" replacement="$1 " />
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.NGramTokenizerFactory" minGramSize="1" maxGramSize="15" />
<!--filter class="solr.StopFilterFactory"
ignoreCase="true"
words="lang/stopwords_en.txt"
enablePositionIncrements="true"
/-->
<!--filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/-->
</analyzer>
<analyzer type="query">
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([0-9]{1,})(st |th |ST |TH )" replacement="$1 " />
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<!--filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" /-->
<!--filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/-->
</analyzer>
</fieldType>
<field name="building_search_text" type="text_ngram" indexed="true" stored="true" multiValued="true"/>
My field is multivalued because I also include the building_name and other text.
It seems that the PatternReplaceCharFilterFactory works when I try it in the admin interface -> analyze, because I get this result when I test with "35 West 15th Street":
PRCF text 35 West 15 Street
for both query and index.
But when I query, I get this output:
"building_search_text": [
"259 West 15th Street, 259 West 15th Street",
"259 West 15th Street"
],
At query time it also doesn't work as expected.
Query: item_type:Building AND building_search_text:(35 West 15th Street)
Here is the output of the query debug (the th is not stripped):
"debug": {
"rawquerystring": "item_type:Building AND building_search_text:(35 West 15th Street)",
"querystring": "item_type:Building AND building_search_text:(35 West 15th Street)",
"parsedquery": "+item_type:Building +(building_search_text:35 building_search_text:west building_search_text:15th building_search_text:street)",
"parsedquery_toString": "+item_type:Building +(building_search_text:35 building_search_text:west building_search_text:15th building_search_text:street)",
I'm not sure if it's a bug that could be related to the multivalued field or if I'm doing something wrong.
Does anyone have an idea?
Why not use a http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory (splitOnNumerics="1") so that street names like 22nd and 3rd are also split into a number part and a letter part?
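A rough sketch of that idea is below. The type name text_address is hypothetical, and the surrounding tokenizer and lowercase filter are assumptions chosen to keep the example small.

```xml
<!-- Sketch only: "text_address" is a made-up name for illustration -->
<fieldType name="text_address" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- splitOnNumerics="1" splits at letter/digit boundaries, e.g. "15th" becomes "15" and "th" -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" splitOnNumerics="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the same analyzer runs at index and query time, a search for either "15" or "15th" would produce a "15" token that can match the indexed address.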
Here is the answer to my own problem.
I was using the wrong tokenizer.
Here is the new fieldType definition:
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="([0-9]{1,})(st|th)\s?" replacement="$1 " replace="all" />
<filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15" />
<!--filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" enablePositionIncrements="true" /-->
<!--filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/-->
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="([0-9]{1,})(st|th)\s?" replacement="$1 " replace="all" />
<!--filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" /-->
<!--filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/-->
</analyzer>
</fieldType>