SOLR Java APi - Adding multiple condition - solrj

I am using Solr API to index the records and for search functionality. I am using the following code to search through
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("q", "country_id:("+id+")");
I would like to add one more parameter like state_id and I would like to do logical AND/OR operations and depending on the result the records should be retrieved. I searched through Google, but could not find a way to combine the conditions. Is it possible through the SOLR api? Or am I doing something wrong?

You can make Your Query as Following.....
String qr="cstm_text:"+"devotional";
SolrQuery qry = new SolrQuery(qr);
qry.setIncludeScore(true);
qry.setShowDebugInfo(true);
qry.setRows(1);
qry.setFacet(true);
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("qt", "/spell");
params.set("spellcheck", "on");
params.set(MoreLikeThisParams.MLT,true);
qry.add(params);

Related

Tweepy API search doesn't have keyword

I am working with Tweepy (python's REST API client) and I'm trying to find tweets by several keywords and without url included in tweet.
But search results are not up to our satisfaction. Looks like query has erros and was stopped. Additionally we had observed that results were returned one-by-one not (as previously) in bulk packs of 100.
Could you please tell me why this search does not work properly?
We wanted to get all tweets mentioning 'Amazon' without any URL links in the text.
We used search shown below. Search results were still containing tweets with URLs or without 'Amazon' keyword.
Could you please let us know what we are doing wrong?
auth = tweepy.AppAuthHandler(consumer_key, consumer_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
searchQuery = 'Amazon OR AMAZON OR amazon filter:-links' # Keyword
new_tweets = api.search(q=searchQuery, count=100,
result_type = "recent",
max_id = sinceId,
lang = "en")
The minus sign should be put before "filter", not before "links", like this:
searchQuery = 'Amazon OR AMAZON OR amazon -filter:links'
Also, I doubt that the count = 100 option is a valid one, since it is not listed on the API documentation (which may not be very up-to-date, though). Try to replace that with rpp = 100 to get tweets in bulk packs.
I am not sure why some of the tweets you find do not contain the "Amazon" keyword, but a possibility is that "Amazon" is contained within the username of the poster. I do not know if you can filter that directly in the query, or even if you would want to filter it, since it would mean you would reject tweets from the official Amazon accounts. I would suggest that, for each tweet the query returns, you check it to make sure it does contain "Amazon".

what's best approach permalink database with zf2 and doctrine2?

i'm building a web site with zf2 and doctrine. How can i save permalink to the my database? What's the best approach way? Should i save permalink in my entities?
//Product Doctirine Entity
public function setProdName($prodName){
$this->prodName = $prodName;
//is this right way?
$this->setSeoLink = urlhelper->url($prodName);
}
I suggest you to create your urls using hash, like
/this-is-the-product-title-xyz123.html
where xyz123 is the hash of your product.
Doing this you'll have these advantage
1) you can change your product title anytime, your product can always be reached even the search engine or links in other website has the old one, for instance both
/this-is-my-product-xyz123.html
and
/this-is-my-wonderful-product-xyz123.html
will work.
2) you are not showing the real id to people.
To accomplish this I used the HashId module, creating a factory service that return an already configurated object, then you can encrypt/descript using this code:
$hashids = $this->getServiceLocator()->get('my.service.hashid');
// decrypt having hash
$id = $hashids->decrypt($hash);
// encrypt having id
$hash = $hashids->encrypt($id);
The .html suffix is just as example, you can create your url as you wish.

Partial text search with postgreql & django

Tried to implement partial text search with postgresql and django,used the following query
Entry.objects.filter(headline__contains="search text")
This returns records having exact match,ie suppose checking for a match against the record "welcome to the new world" with query __contains="welcome world" , returns zero records
How can i implement this partial text search with postgresql-8.4 and django?
If you want this exact partial search you can use the startswitch field lookup method: Entry.objects.filter(headline__startswith="search text"). See more info at https://docs.djangoproject.com/en/dev/ref/models/querysets/#startswith.
This method creates a LIKE query ("SELECT ... WHERE headline LIKE 'search text%'") so if you're looking for a fulltext alternative you can check out PostgreSQL's built in Tsearch2 extension or other options such as Xapian, Solr, Sphinx, etc.
Each of the former engines mentioned have Django apps that makes them easier to integrate: Djapian for Xapian integration or Haystack for multiple integrations in one app.

Amazon Product Advertising API: Get Average Customer Rating

When using Amazon's web service to get any product's information, is there a direct way to get the Average Customer Rating (1-5 stars)? Here are the parameters I'm using:
Service=AWSECommerceService
Version=2011-08-01
Operation=ItemSearch
SearchIndex=Books
Title=A Game of Thrones
ResponseGroup=Large
I would expect it to have a customer rating of 4.5 and total reviews of 2177. But instead I get the following in the response.
<CustomerReviews><IFrameURL>http://www.amazon.com/reviews/iframe?...</IFrameURL></CustomerReviews>
Is there a way to get the overall customer rating, besides for reading the <IFrameURL/> value, making another HTTP request for that page of reviews, and then screen scraping the HTML? That approach is fragile since Amazon could easily change the reviews page structure which would bust my application.
You can scrape from here. Just replace the asin with what you need.
http://www.amazon.com/gp/customer-reviews/widgets/average-customer-review/popover/ref=dpx_acr_pop_?contextId=dpx&asin=B000P0ZSHK
As far as i know, Amazon changed it's API so its not possible anymore to get the reviewrank information. If you check this Link the note sais:
As of November 8, 2010, only the iframe URL is returned in the request
content.
However, testing with the params you used to get the Iframe it seems that now even the Iframe dosn't work anymore. Thus, even in the latest API Reference in the chapter "Motivating Customers to Buy" the part "reviews" is compleatly missing.
However: Since i'm also very interested if its still possible somehow to get the reviewrank information - maybe even not using amazon API but a competitors API to get review rank informations - i'll set up a bounty if anybody can provide something helpful on that. Bounty will be set in this topic in two days.
You can grab the iframe review url and then use css to position it so only the star rating shows. It's not ideal since you're not getting raw data, but it's an easy way to add the rating to your page.
Sample of this in action - http://spamtech.co.uk/positioning-content-inside-an-iframe/
Here is a VBS script that would scrape the rating. Paste the code below to a text file, rename it to Test.vbs and double click to run on Windows.
sAsin = InputBox("What is your ASIN?", "Amazon Standard Identification Number (ASIN)", "B000P0ZSHK")
if sAsin <> "" Then
sHtml = SendData("http://www.amazon.com/gp/customer-reviews/widgets/average-customer-review/popover/ref=dpx_acr_pop_?contextId=dpx&asin=" & sAsin)
sRating = ExtractHtml(sHtml, "<span class=""a-size-base a-color-secondary"">(.*?)<\/span>")
sReviews = ExtractHtml(sHtml, "<a class=""a-size-small a-link-emphasis"".*?>.*?See all(.*?)<\/a>")
MsgBox sRating & vbCrLf & sReviews
End If
Function ExtractHtml(sHtml,sPattern)
Set oRegExp = New RegExp
oRegExp.Pattern = sPattern
oRegExp.IgnoreCase = True
Set oMatch = oRegExp.Execute(sHtml)
If oMatch.Count = 1 Then
ExtractHtml = Trim(oMatch.Item(0).SubMatches(0))
End If
End Function
Function SendData(sUrl)
Dim oHttp 'As XMLHTTP30
Set oHttp = CreateObject("Msxml2.XMLHTTP")
oHttp.open "GET", sUrl, False
oHttp.send
SendData = Replace(oHttp.responseText,vbLf,"")
End Function
Amazon has completely removed support for accessing rating/review information from their API. The docs mention a Response Element in the form of customer rating, but that doesn't work either.
Google shopping using Viewpoints for some reviews and other sources
This is not possible from PAPI. You either need to scrape it by yourself, or you can use other free/cheaper third-party alternatives for that.
We use the amazon-price API from RapidAPI for this, it supports price/rating/review count fetching for up to 1000 products in a single request.

Deleting index from Solr using solrj as a client

I am using solrj as client for indexing documents on the solr server.
I am having problem while deleting the indexes by 'id' from the solr server.
I am using following code to delete the indexes:
server.deleteById("id:20");
server.commit(true,true);
After this when i again search for the documents, the search result contains the above document also. Dont know what is going wrong with this code.
Please help me out with issue.
Thanks!
When you call deleteById, just use the id, without query syntax:
server.deleteById("20");
server.commit();
After you delete the document, commit the server and add the following lines.
After the server commit line.
UpdateRequest req = new UpdateRequest();
req.setAction( UpdateRequest.ACTION.COMMIT, false, false );
req.add( docs );
UpdateResponse rsp = req.process( server );
Use the method deleteByQuery() to delete the documents matching the query:
server.deleteByQuery("id:20");
server.commit();
So the deleteById will work only if you are forming your key using only one attribute. So, I had case where the id was a combination of multiple attributes like employeeId+deptId. But, my table had employeeId & deptId as separate columns as well with indexes created on it. So when I wanted to delete a record I had only the employeeId and not deptId. I used the curl command to delete where you can specify the column and its value and it will delete the entire record.
E.g.
curl http://localhost:8983/solr/update --data ':' -H 'Content-type:text/xml; charset=utf-8'