I have a search domain on AWS CloudSearch, and I know the names of all the defined facets.
I would like to build a web query form for it, with my category values (facets) listed on the side, as is done on the Amazon web store.
The only way I have found to get facet values is to run a query (passing a query parameter); the response then contains the facets linked to that query's results.
Is there a way to fetch all the possible facet.FIELD values in order to build the query form?
If not (as read here), how should a form that uses facets be designed?
You could also use the matchall keyword in a structured query. Since you don't need the results, you can pass size=0 so that you only get the facets, which will reduce latency.
/2013-01-01/search?q=matchall&q.parser=structured&size=0&facet.field={sort:'count',size:100}
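If you want to build that request from code, here is a minimal sketch using only the Python standard library. The helper name build_facet_query and the facet names category and brand are assumptions for illustration; substitute the facet fields defined on your domain. It emits one facet.FIELD parameter per field and sets size=0 so that no hits come back.

```python
import json
from urllib.parse import urlencode


def build_facet_query(facet_fields, bucket_count=100):
    """Return a search path that matches every document and asks only for facets."""
    params = [
        ("q", "matchall"),
        ("q.parser", "structured"),
        ("size", "0"),  # no hits needed, only the facet buckets
    ]
    for field in facet_fields:
        # One facet.FIELD parameter per facet, sorted by document count.
        options = json.dumps({"sort": "count", "size": bucket_count})
        params.append(("facet." + field, options))
    return "/2013-01-01/search?" + urlencode(params)


# Hypothetical facet names; replace them with the ones defined on your domain.
print(build_facet_query(["category", "brand"]))
```

The facet options are URL-encoded here, so the printed path looks noisier than the hand-written one above but carries the same parameters.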
Related
Using AWS Cloudsearch, I need to query 2 separate fields for the same value using a structured (compound) query e.g.
(and (or name:'john smith') (or curr_addr:'123 someplace' other_addr:'123 someplace'))
This query works, but I'm wondering if it's necessary to repeat the value for each field that I want to search against. Is there some way to specify the value only once, e.g. curr_addr+other_addr:'123 someplace'?
That is the correct way to structure your compound query. From the AWS documentation, you'll see that they structure their example query the same way:
(and title:'star' (or actors:'Harrison Ford' actors:'William Shatner')(not actors:'Zachary Quinto'))
From Constructing Compound Queries
You may be able to get around this by listing the most repetitive fields in the query options (q.options) and then naming a field explicitly only for the remaining terms. The fields list acts as a fallback for terms in a compound query that don't specify which field to search. So if you list the address fields there and only specify the name field in your query, you may get close to the behavior you're looking for.
Query options
q.options={fields: ['curr_addr','other_addr']}
Query
(and (or name:'john smith') (or '123 someplace'))
But this approach would only work for one set of repetitive fields, so it's not a silver bullet by any means.
From Search API Reference (see q.options => fields)
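Another way to avoid typing the value twice is to generate the repeated terms in client code. The sketch below is only an illustration: the helper names quote_value and any_field are made up, and the escaping (a backslash before single quotes and backslashes) reflects my understanding of the structured query string syntax, so verify it against the documentation.

```python
def quote_value(value):
    """Wrap a value in single quotes, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def any_field(fields, value):
    """Build (or f1:'v' f2:'v' ...) so the value is written once in the calling code."""
    quoted = quote_value(value)
    terms = " ".join(f"{field}:{quoted}" for field in fields)
    return f"(or {terms})"


query = "(and {} {})".format(
    any_field(["name"], "john smith"),
    any_field(["curr_addr", "other_addr"], "123 someplace"),
)
print(query)
# (and (or name:'john smith') (or curr_addr:'123 someplace' other_addr:'123 someplace'))
```

The query sent to CloudSearch is unchanged; the repetition simply moves out of the hand-written string.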
I'm wondering whether there is a way to specify the returned fields for a search request to the Elasticsearch backend. See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html for how to specify the list in the JSON API.
Let me explain why I need this. I have lots of articles with large text bodies. Searching is very slow in this case, because Elasticsearch returns the whole text for each search result, but I only want to render the titles rather than the whole text.
Or maybe there is another way to do it?
There are multiple options here:
You can use the fields option in Elasticsearch to specify the list of fields whose values should be returned. This saves some latency because less data has to be transported back. However, the actual data is stored in _source, which still has to be fetched from disk and deserialized for each call.
LINK - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
If you don't need to retrieve the large field but only want it to be searchable, you can disable _source and enable store for each field whose data needs to be retrievable.
LINK , _source - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-source-field.html
LINK , store - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html
Django haystack documentation - http://django-haystack.readthedocs.org/en/latest/searchresult_api.html#SearchResult.get_additional_fields
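To make the first option concrete, here is a rough sketch of a search request body that asks for the title only. The field names title and text are assumptions about the mapping. The page linked above documents the fields key, but which key is accepted varies between Elasticsearch versions, so the key is left as a parameter; check the reference for the version you run. How to get django-haystack to send such a body is not covered here.

```python
import json


def title_only_body(search_text, key="fields"):
    """Build a search body that requests only the title of each hit."""
    return {
        # "text" is an assumed name for the large article body field.
        "query": {"match": {"text": search_text}},
        # "title" is an assumed name for the field to render.
        key: ["title"],
    }


print(json.dumps(title_only_body("cloudsearch facets"), indent=2))
```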
I've put together a simple search form, with a search box and a couple of filters as dropdowns. Everything works as you'd expect, except that I want the behavior to be that when the user leaves everything completely blank (no search query, no filters) they simply get everything returned (paginated of course).
I'm currently achieving this by detecting this special case and querying my local database, but there are some advantages to doing it 100% with CloudSearch. Is there a way to build a request that simply returns a paginated list of every document? In other words, is there a CloudSearch equivalent to "SELECT id FROM x LIMIT n?"
Thanks in advance!
Joe
See the Search API.
?q=matchall&q.parser=structured will match all the documents.
The easiest way would be to use a not operator, so for example:
?q=dog|-dog
would return all documents that either contained 'dog' or did not contain 'dog'. You would need to intercept the special case, as you already do, and substitute a query/not-query combination, and you should get everything back.
For someone looking for an answer using boto3.
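import boto3

# Fill in your own credentials, region and the domain's search endpoint.
CLOUD_SEARCH_CLIENT = boto3.client(
    'cloudsearchdomain',
    aws_access_key_id='',
    aws_secret_access_key='',
    region_name='',
    endpoint_url="https://search-your-endpoint-url.amazonaws.com"
)
# matchall with the structured parser matches every document in the domain.
response = CLOUD_SEARCH_CLIENT.search(
    query="matchall",
    queryParser='structured'
)
print(response)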
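Since the question asks for a paginated list, here is a small sketch that computes the start and size arguments of the boto3 search call for a given page. The helper name matchall_page and the default page size are assumptions; the function only builds the keyword arguments, so it runs without AWS credentials.

```python
def matchall_page(page, page_size=20):
    """Return keyword arguments for one page of a match-everything search."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {
        "query": "matchall",
        "queryParser": "structured",
        "start": (page - 1) * page_size,  # offset of the first hit on this page
        "size": page_size,                # number of hits per page
    }


kwargs = matchall_page(page=3, page_size=20)
print(kwargs["start"], kwargs["size"])  # prints: 40 20
# With a client created as above: response = CLOUD_SEARCH_CLIENT.search(**kwargs)
```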
I'm listing queryset results and would like to add an option for choosing the order in which the results are displayed.
I would like to pass the actual data from the database to another page for sorting.
I was able to achieve this by collecting all the object ids and using the Django session to recreate a new queryset based on the ordering criteria.
I was wondering whether there is any other way to achieve this?
10x
Assuming you are currently displaying the data as a table, you could give a client-side JavaScript table sorter such as tablesorter a chance. There are lots of JavaScript table sorters.
I'm away from my development machine right now, but I think you could just pass the list of ids to a new Queryset, pk__in=list_of_object_ids, and then use the native order_by function.
For example:
objs = Object.objects.filter(pk__in=list_of_object_ids).order_by('value_to_order_by')
Anyway, that's what I would try first, though I'm sure there are better optimizations.
For example, instead of a list of object ids, you could pass a dictionary with a key:value pair that has the value you want to order by.
For example:
[{'obj_id':1,'obj_value':'foo'},{'obj_id':2,'obj_value':'foo'}]
Then use some lambda function to sort it, like here.
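A minimal sketch of that lambda-based sort, using made-up sample values and a hypothetical helper name sort_rows:

```python
# Made-up sample data in the shape shown above.
rows = [
    {"obj_id": 1, "obj_value": "foo"},
    {"obj_id": 2, "obj_value": "bar"},
    {"obj_id": 3, "obj_value": "baz"},
]


def sort_rows(rows, key_name, descending=False):
    """Return a new list of dicts ordered by the given key."""
    return sorted(rows, key=lambda row: row[key_name], reverse=descending)


ordered = sort_rows(rows, "obj_value")
print([row["obj_id"] for row in ordered])  # prints: [2, 3, 1]
```

This sorts in memory without touching the database, which is fine for one page of results but will not scale to very large result sets.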
I currently have a .NET method that looks like this - GetUsers(Filter filter) and this is invoked from the client by sending a SOAP request. As you can probably guess, it takes a bunch of filters and returns a list of users that match those filters. The filters aren't too complicated, they're basically just a set of from date, to date, age, sex etc. and the set of filters I have at any point is static.
I was wondering what the RESTful way of doing this was. I'm guessing I'll have a Users resource. Will it be something like GET /Users?fromDate=11-1-2011&toDate=11-2-2011&age=&sex=M ? Is there a way to pass it a Filter without having to convert it into individual attributes?
I'm consuming this service from a mobile client, so I think the overhead of an extra request that actually creates a filter: POST filters is bad UX. Even if we do this, what does POST filters even mean? Is that supposed to create a filter record in the database? How would the server keep track of the filter that was just created if my sequence of requests is as follows?
POST /filters -- returns a filter
{
"from-date" : "11-1-2011",
"to-date" : "11-2-2011",
"age" : "",
"sex" : "M"
}
GET /Users?filter=filter_id
Apologies if the request came off as being a little confused. I am.
Thanks,
Teja
We are doing it just like you had it:
GET /Users?fromDate=11-1-2011&toDate=11-2-2011&age=&sex=M
We have 9 querystring values.
I don't see any problem with that.
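On the client side, the filter object does not have to be flattened by hand. Here is a small sketch, assuming the filter is available as a plain mapping; the helper name users_url is made up, and dropping empty filters instead of sending age= is a design choice, not a requirement.

```python
from urllib.parse import urlencode


def users_url(filters, base="/Users"):
    """Turn a filter mapping into a GET URL, skipping filters left empty."""
    params = {name: value for name, value in filters.items() if value not in ("", None)}
    return base + "?" + urlencode(params) if params else base


filters = {"fromDate": "11-1-2011", "toDate": "11-2-2011", "age": "", "sex": "M"}
print(users_url(filters))
# /Users?fromDate=11-1-2011&toDate=11-2-2011&sex=M
```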
The way I handle it is I do a POST with the body containing the parameters and then I return a redirect to a GET. What the GET URL looks like is completely up to the server. If you want to convert the filter into separate query params you can, or if you want to persist a filter and then add a query param that points to the saved filter that's ok too. The advantage is that you can change your mind at any time and the client doesn't care.
Also, because you are doing a GET, you can take advantage of caching, which should more than make up for the extra request.