I am using Django, Haystack, and Solr for search. I am able to search, and now I would like to find similar items using more_like_this. When I try the more_like_this functionality, I get back all of the objects of that model type instead of just the ones that closely match. Here is some code to show how I am using it:
from django.shortcuts import render_to_response
from django.template import RequestContext
from haystack.query import SearchQuerySet

def resource_view(request, slug):
    resource = Resource.objects.get(slug=slug)
    versions = Version.objects.get_for_object(resource)
    # Ask the search backend for documents similar to this resource
    related = SearchQuerySet().more_like_this(resource)
    add_comment_form = AddCommentForm()
    return render_to_response('resources/resource.html',
                              {'resource': resource,
                               'versions': versions,
                               'related': related,
                               'add_comment_form': add_comment_form},
                              context_instance=RequestContext(request))
Apparently I need to enable MLT in the solrconfig.xml file. Does anyone know how to do this, or know of a helpful article or tutorial?
Stale question, but here's the answer anyway:
As John already pointed out, you need to add the MoreLikeThis (MLT) handler to your Solr config. The following should do: put it somewhere in your solrconfig.xml and reload Solr (Tomcat):
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
  <lst name="defaults">
    <str name="mlt.mintf">1</str>
    <str name="mlt.mindf">1</str>
    <str name="mlt.minwl">3</str>
    <str name="mlt.maxwl">15</str>
    <str name="mlt.maxqt">20</str>
    <str name="mlt.match.include">false</str>
  </lst>
</requestHandler>
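Before going back through Haystack, it can help to hit the new handler directly and confirm that it responds. Here is a minimal sketch that only builds the request URL. The base URL, the document id (written in Haystack's usual app.model.pk style), and the field name text are all assumptions for illustration and are not taken from the question:

```python
from urllib.parse import urlencode

def build_mlt_url(base_url, doc_id, field="text", rows=5):
    """Build a URL for the /mlt handler; every default here is illustrative."""
    params = {
        "q": 'id:"%s"' % doc_id,  # the document to find neighbours for
        "mlt.fl": field,          # field(s) used to compute similarity
        "rows": rows,
        "wt": "json",
    }
    return base_url.rstrip("/") + "/mlt?" + urlencode(params)

# Hypothetical host and document id
url = build_mlt_url("http://localhost:8983/solr", "resources.resource.1")
print(url)
```

Opening the printed URL in a browser should show whether Solr itself returns a narrowed list of similar documents.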
Our SOAP web service provider insists that we remove empty field tags from the request because they break the service. Is that the right practice?
See the example request below. StockID is an empty tag. Should it break the SOAP service?
I would like to know the best practice around empty tags in a request.
<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body xmlns:m="http://www.example.org/stock">
    <m:GetStockPrice>
      <m:StockName>IBM</m:StockName>
      <m:StockID/>
    </m:GetStockPrice>
  </soap:Body>
</soap:Envelope>
This depends on how the rest of the system is programmed. There are three ways of sending StockID:
Empty tag
Tag removed
Tag with xsi:nil="true"
What is probably happening with the first option is that the program deserializes the tag as an empty string and then crashes because there is no stock with id "".
With the last two options, the value would deserialize as NULL, and the program would then not try to find a stock with id NULL.
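The difference between the three variants can be sketched in a few lines of Python. This is only an illustration of the deserialization behaviour described above, not the provider's actual code. The SOAP envelope and the m: namespace are left out for brevity, and read_stock_id is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

def read_stock_id(xml_text):
    """Return the StockID value, or None when the tag is absent or xsi:nil."""
    root = ET.fromstring(xml_text)
    node = root.find("StockID")
    if node is None:
        return None              # tag removed
    if node.get("{%s}nil" % XSI) == "true":
        return None              # tag explicitly marked as nil
    return node.text or ""       # empty tag becomes an empty string

empty_tag = "<GetStockPrice><StockName>IBM</StockName><StockID/></GetStockPrice>"
removed = "<GetStockPrice><StockName>IBM</StockName></GetStockPrice>"
nil_tag = ('<GetStockPrice xmlns:xsi="%s"><StockName>IBM</StockName>'
           '<StockID xsi:nil="true"/></GetStockPrice>' % XSI)

for label, doc in [("empty", empty_tag), ("removed", removed), ("nil", nil_tag)]:
    print(label, repr(read_stock_id(doc)))
```

A service that then looks up a stock by the returned value would search for "" in the first case and skip the lookup in the other two, which is consistent with the provider's request.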
If one wants to extract/match Open Graph (og:) tags from HTML using regex (and ColdFusion 9+), how would one go about doing it?
The tricky bit is that it has to cover both possible variations of tag formation, as in the following examples:
<meta property="og:type" content="website" />
<meta content="website" property="og:type"/>
So far all I got is this:
<cfset tags = ReMatch('(og:)(.*?)>',html_content)>
It does match both of the tags, but only the first type has the content part returned with it, and the content is something that I require.
Just to make it absolutely clear, the desired output should be an array with all of the OG tags (they could be type, image, author, description, etc.). That means it should be flexible and not based on the og:type example alone.
Of course, if it's possible, the ideal output would be a struct with the tag name as the key and the value (content) as the value. But that can be achieved with post-processing and is not as important as extracting the tags themselves.
Cheers,
Simon
So you want an array like ['og:author','og:type', 'og:image'...]?
Try using a regex like og:([\w]+)
That should give you a start. You will have duplicates if you have two of the same og:foo meta tags.
You can also look at jsoup to parse the HTML for you. It makes this a lot easier.
There are a few good blog posts on using it in CFML:
jQuery-like parsing in Java
Parsing, Traversing, And Mutating HTML With ColdFusion And jSoup
OK, so after the suggestion from @abbottmw (thank you very much!), here's the solution:
Download the jsoup jar file from here: http://jsoup.org/download
Then initialize it like this:
<!--- fetch your HTML content --->
<cfhttp url="...." result="oghtml">
<cfscript>
paths = [expandPath("/lib/jsoup.jar")]; // or wherever you decide to place the file
loaderObj = createObject("component", "javaloader.JavaLoader").init(paths);
jsoup = loaderObj.create("org.jsoup.Jsoup");
doc = jsoup.parse(oghtml.fileContent); // cfhttp returns a struct; the page body is in fileContent
tags = doc.select("meta[property*=og:]");
</cfscript>
<cfloop index="e" array="#tags#">
<cfoutput>
#e.attr("property")# | #e.attr("content")#<br />
</cfoutput>
</cfloop>
And that is it. The complete list of og tags is in the tags array.
Of course, it's not the regex solution that was originally requested, but hey, it works!
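The reason a parser copes with both attribute orders is that it hands you the attributes as name/value pairs, so their position in the tag no longer matters. For readers outside ColdFusion, here is a rough sketch of the same idea using Python's standard library instead of jsoup. OpenGraphParser and the sample markup are made up for illustration:

```python
from html.parser import HTMLParser

class OpenGraphParser(HTMLParser):
    """Collect og:* meta tags into a dict, regardless of attribute order."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        if prop.startswith("og:"):
            self.tags[prop] = attr_map.get("content")

# Hypothetical sample covering both attribute orders
html_content = """
<meta property="og:type" content="website" />
<meta content="Example" property="og:title"/>
<meta name="description" content="ignored" />
"""

parser = OpenGraphParser()
parser.feed(html_content)
print(parser.tags)
```

The result is the name-to-content mapping that the question describes as the ideal output.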
I have tried many times to submit a product feed (Submit Feed) to Amazon Marketplace. There is a lack of clear information, and the reference documentation is not very good either. It only covers basic feed submission.
I need to create a new product with size and color information. Please see the code below:
<MessageType>Product</MessageType>
<Message>
<MessageID>1</MessageID>
<OperationType>Update</OperationType>
<Product>
<SKU>5000-***-**-*</SKU>
<StandardProductID>
<Type>UPC</Type>
<Value>YSjsjs899ss</Value>
</StandardProductID>
<Condition>
<ConditionType>New</ConditionType>
</Condition>
<DescriptionData>
<Title>Backout T-Shirt Light Pink Medium</Title>
<Brand>Blackout T-Shirt</Brand>
<Description>This is an sample product added by bala.</Description>
<BulletPoint>Made in Italy</BulletPoint>
<MSRP currency="USD">2.19</MSRP>
<Manufacturer>Peacock Alley</Manufacturer>
<ItemType>Novelty T-Shirts</ItemType>
</DescriptionData>
<ProductData>
<Clothing>
<VariationData>
<Parentage>child</Parentage>
<VariationTheme>SizeColor</VariationTheme>
<Size>M</Size>
</VariationData>
<SizeMap>Medium</SizeMap>
<ColorName>Light Pink</ColorName>
<ColorMap>pink</ColorMap>
<ClassificationData>
<ClothingType>Underwear</ClothingType>
<Department>mens</Department>
<ModelNumber>CM203</ModelNumber>
</ClassificationData>
</Clothing>
</ProductData>
</Product>
</Message>
But it is not working. Please guide me on how to do this.
Regards,
Balaganesh
I'm aware that this answer is late, but I just had another question today that led me here, and while this didn't help me, I believe I can answer this question anyway. So here goes:
You can find the XSD -- which is the xml schema file -- here: https://images-na.ssl-images-amazon.com/images/G/01/rainier/help/xsd/release_4_1/ProductClothing.xsd
The notable thing about an XSD, as opposed to JSON, is that it uses xs:sequence elements. This means that the order in which the elements appear in the XML file matters.
For the SizeColor variation theme, the VariationData block should look like this:
<VariationData>
  <Parentage>child</Parentage>
  <Size>S</Size>
  <Color>White</Color>
  <VariationTheme>SizeColor</VariationTheme>
</VariationData>
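A quick way to catch ordering mistakes before submitting a feed is to compare the child elements against the expected sequence. The sketch below is only illustrative: EXPECTED_ORDER is copied from the VariationData example above rather than read from the XSD, and out_of_order is a hypothetical helper. For real validation, the XSD itself remains the authority.

```python
import xml.etree.ElementTree as ET

# Order copied from the example above, not parsed from the XSD
EXPECTED_ORDER = ["Parentage", "Size", "Color", "VariationTheme"]

def out_of_order(xml_text, expected=EXPECTED_ORDER):
    """Return the child tags that sit in the wrong position relative to `expected`."""
    children = [child.tag for child in ET.fromstring(xml_text)]
    known = [tag for tag in children if tag in expected]
    in_sequence = sorted(known, key=expected.index)
    return [tag for tag, want in zip(known, in_sequence) if tag != want]

# The block from the question: VariationTheme comes before Size
from_question = ("<VariationData><Parentage>child</Parentage>"
                 "<VariationTheme>SizeColor</VariationTheme>"
                 "<Size>M</Size></VariationData>")

# The block from this answer
from_answer = ("<VariationData><Parentage>child</Parentage>"
               "<Size>S</Size><Color>White</Color>"
               "<VariationTheme>SizeColor</VariationTheme></VariationData>")

print(out_of_order(from_question))
print(out_of_order(from_answer))
```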
Our Solr build is working, as
http://192.168.1.106:8983/solr/spell?q=query&spellcheck=true&spellcheck.build=true
successfully returns spelling suggestions based on our index-based dictionary.
However, the django-haystack template variable {{suggestion}} and even the Python call SearchQuerySet().spelling_suggestion("query") return None.
We use the standard view and URL provided by Haystack.
The installed software is:
Python 2.7.2, Django 1.3.2,
Haystack 2.0, Apache Solr 3.6.1 (running on standard Jetty), PySolr 2.1.
Here is some of the code we are using:
In settings.py:
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://192.168.1.106:8983/solr',
        'INCLUDE_SPELLING': True,
    },
}
In /PATH/TO/SOLR/example/solr/conf/solrconfig.xml:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">textSpell</str>
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">text</str>
<str name="spellcheckIndexDir">spellchecker</str>
</lst>
</searchComponent>
<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="df">text</str>
<str name="spellcheck.onlyMorePopular">false</str>
<str name="spellcheck.extendedResults">false</str>
<str name="spellcheck.count">10</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
So the question is: what in this setup stops Haystack from picking up the spelling suggestions that Solr is finding? In other words, why does Haystack show no spelling suggestions while Solr provides some?
Do you have the INCLUDE_SPELLING setting defined as True in the CONNECTIONS in your Django settings file? http://django-haystack.readthedocs.org/en/latest/searchqueryset_api.html#spelling-suggestion
One thing that may be helpful is to see exactly what Haystack is sending to Solr. You can add a print statement into the SolrBackend class in Haystack's backends/solr_backend.py in the search function so you can see the URL being used. That would at least show you if Haystack is doing the search suggestion as ordered.
You may also want to check for Haystack updates directly from the GitHub repo. Development there is pretty active.
I was struggling with this problem, and spelling_suggestion wasn't returning anything until I added the spellcheck component to the "/select" request handler. So the default connection is as shown above, and the "/select" handler looks like this:
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
Additionally, I also added the following to my RestaurantIndex class in search_indexes.py:
# Inside the RestaurantIndex class body
# Suggestions - so obvious
suggestions = indexes.CharField()

def prepare(self, obj):
    prepared_data = super(RestaurantIndex, self).prepare(obj)
    # Copy the main document text into the field used for suggestions
    prepared_data['suggestions'] = prepared_data['text']
    return prepared_data
In this case, 'text' is whichever of your own fields should feed the suggestions.
After this, you can run the following in a Python shell and results should come up:
sqs = SearchQuerySet().auto_query('Restauront')
spelling = sqs.spelling_suggestion()
The suggestion I get for this is restaurant.
Cheers!
PS: If you need extra config, just ask.
I am trying to use SolrJ for my application. My code is given below:
query.add("q", "simplify360");
query.add("facet", "true");
query.add("facet.range", "createdOnGMTDate");
query.add("facet.range.start", "2010-08-01T00:00:00Z+330MINUTES");
query.add("facet.range.end", "2011-05-31T00:00:00Z+330MINUTES");
query.add("facet.range.gap", "+1DAY");
//query.add("wt","json");
//query.add("wt.mime-type","application/json");
System.err.println(query.toString());
The code executes fine, and when I run the URL on the Solr server, I get the following result for faceting:
<lst name="facet_counts"><lst name="facet_queries"/>
<lst name="facet_fields"/>
<lst name="facet_dates"/>
<lst name="facet_ranges">
<lst name="createdOnGMTDate">
<lst name="counts">
<int name="2010-01-01T00:00:00Z">0</int>
<int name="2010-01-02T00:00:00Z">0</int>
<int name="2010-01-03T00:00:00Z">0</int>
<int name="2010-01-04T00:00:00Z">0</int>
<int name="2010-01-05T00:00:00Z">0</int>
<int name="2010-01-06T00:00:00Z">0</int>
</lst>
<str name="gap">+1DAY</str>
<date name="start">2010-01-01T00:00:00Z</date>
<date name="end">2011-05-31T00:00:00Z</date>
</lst>
</lst>
</lst>
</response>
1) How can I retrieve these values in Java?
2) Also, is there any way I can convert the JSON response to a JSON Java object?
Regards,
Rohit
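Using the SolrJ API, you can use the following code:
QueryResponse rsp = server.query(query); // issue the query and get the results
Map<String, Integer> fq = rsp.getFacetQuery(); // returns the facet_queries part of the response
// For the facet_ranges part shown in the question, newer SolrJ versions also provide getFacetRanges():
List<RangeFacet> ranges = rsp.getFacetRanges();
Hope this helps ...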
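On the second point, once wt=json is set the response is plain JSON, so any JSON library can turn it into native objects. Below is a minimal sketch of the idea, written in Python rather than Java. The sample is hand-written to mirror the XML in the question, and it assumes Solr's default flat list layout, where labels and counts alternate in one array. range_counts is a hypothetical helper:

```python
import json

# Hand-written sample shaped like a wt=json facet.range response (assumed layout)
sample = """
{"facet_counts": {"facet_ranges": {"createdOnGMTDate": {
  "counts": ["2010-01-01T00:00:00Z", 0, "2010-01-02T00:00:00Z", 0],
  "gap": "+1DAY",
  "start": "2010-01-01T00:00:00Z",
  "end": "2011-05-31T00:00:00Z"}}}}
"""

def range_counts(response_text, field):
    """Return a {bucket_start: count} dict for one facet.range field."""
    data = json.loads(response_text)
    flat = data["facet_counts"]["facet_ranges"][field]["counts"]
    # Pair up the alternating label/count entries
    return dict(zip(flat[0::2], flat[1::2]))

counts = range_counts(sample, "createdOnGMTDate")
print(counts)
```

In Java the same walk through facet_counts, facet_ranges, and the field name applies with whichever JSON library you prefer.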