Is it possible to query multiple AWS Cloudsearch fields for the same value without repeating? - amazon-web-services

Using AWS Cloudsearch, I need to query 2 separate fields for the same value using a structured (compound) query e.g.
(and (or name:'john smith') (or curr_addr:'123 someplace' other_addr:'123 someplace'))
This query works, but I'm wondering if it's necessary to repeat the value for each field that I want to search against. Is there some way to specify the value only once e.g. curr_addr+other_addr:'123 someplace'

That is the correct way to structure your compound query. From the AWS documentation, you'll see that they structure their example query the same way:
(and title:'star' (or actors:'Harrison Ford' actors:'William Shatner')(not actors:'Zachary Quinto'))
From Constructing Compound Queries
You may be able to get around this by listing the repeated fields in the query options (q.options) and naming a field explicitly only for the remaining terms. The fields list acts as a fallback for any term in a compound query that does not specify a field. So if you list the address fields there and specify only the name field in your query, you may get close to the behavior you're looking for.
Query options
q.options={fields: ['curr_addr','other_addr']}
Query
(and (or name:'john smith') (or '123 someplace'))
But this approach would only work for one set of repetitive fields, so it's not a silver bullet by any means.
From Search API Reference (see q.options => fields)
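If the repetition is mainly a nuisance when writing queries by hand, another option is to generate the compound query client-side. The following is a minimal sketch, not part of the CloudSearch API: the helper names quote_term and or_across_fields are hypothetical, and it assumes that single quotes and backslashes inside a string value must be backslash-escaped in structured queries.

```python
def quote_term(value):
    """Wrap a value in single quotes, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def or_across_fields(fields, value):
    """Build an (or ...) clause matching the same value in each listed field."""
    terms = " ".join(f"{field}:{quote_term(value)}" for field in fields)
    return f"(or {terms})"


# The value is written once; the helper repeats it per field.
address_clause = or_across_fields(["curr_addr", "other_addr"], "123 someplace")
query = f"(and name:{quote_term('john smith')} {address_clause})"
print(query)
```

Unlike the q.options approach, this works for any number of repeated field groups, since each group is simply another call to the helper.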

Related

Quicksight breaking up strings for use of all aspects

I was wondering if anyone has ever had experience with breaking a string up in QuickSight and using certain parts of the string. My example is a data set that returns tags like this: "animals|funny|dog-park". I have used "split(tags,'|',1)", but then all that gets returned is the first part (animals). I have also tried a combination of ifelse->locate->split with no luck. Is there a way to split these tags so that they are all usable, e.g. (animals) & (funny) or (funny) & (dog-park), etc.? That is, can the associated article be counted under one tag and also, separately, under another? I know this will most likely end up being a calculated field. Thank you in advance!
Since QuickSight does not support any form of nested fields (including objects and lists) in analysis, you need to normalise this into separate rows, one per tag, before feeding the data to QuickSight.
Otherwise, if you leave it as is, you would be limited to filtering using string contains and doing string lookup in calculated fields - nevertheless you would not be able to use these tags as categories (such as in colours field well of visuals).
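The normalisation step could look roughly like the sketch below, run as a preprocessing step before the data reaches QuickSight. It is plain Python rather than anything QuickSight-specific; the explode_tags helper and the article_id column are hypothetical and only illustrate turning one row with several tags into one row per tag.

```python
def explode_tags(rows, tag_column="tags", sep="|"):
    """Return one output row per tag, copying the other columns unchanged."""
    exploded = []
    for row in rows:
        for tag in row[tag_column].split(sep):
            tag = tag.strip()
            if tag:  # skip empty pieces such as a trailing separator
                new_row = dict(row)
                new_row[tag_column] = tag
                exploded.append(new_row)
    return exploded


rows = [{"article_id": "1", "tags": "animals|funny|dog-park"}]
for row in explode_tags(rows):
    print(row)
```

Each tag then becomes its own row, so it can be used as a category in a visual, at the cost of the article appearing once per tag.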

How to order django query set filtered using '__icontains' such that the exactly matched result comes first

I am writing a simple app in Django that searches for records in a database.
The user enters a name in the search field, and that query is used to filter records on a particular field, like this -
Result = Users.objects.filter(name__icontains=query_from_searchbox)
E.g. -
Database consists of names- Shiv, Shivam, Shivendra, Kashiva, Varun... etc.
A search query 'shiv' returns records in following order-
Kashiva, Shivam, Shiv and Shivendra
Ordered by primary key.
My question is how can I achieve the order -
Shiv, Shivam, Shivendra and Kashiva.
That is, the most relevant result first, followed by the less relevant ones.
It's not possible to do that with standard Django as that type of thing is outside the scope & specific to a search app.
When you're interacting with the ORM consider what you're actually doing with the database - it's all just SQL queries.
If you wanted to rearrange the results you'd have to manipulate the queryset, check exact matches, then use regular expressions to check for partial matches.
Search isn't really the kind of thing that is best suited to the ORM, however, so you may wish to consider looking at dedicated search applications. They will usually maintain an index, which avoids database hits, and may also offer a percentage-match ordering like the one you're looking for.
A good place to start may be with Haystack
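For a small result set, the manual re-ranking described above can be sketched in plain Python after the icontains filter has run. This is only an illustration under that assumption: relevance_key is a hypothetical helper, and it uses simple string checks instead of regular expressions, ranking exact matches first, then prefix matches, then other substring matches.

```python
def relevance_key(name, query):
    """Sort key: exact match, then prefix match, then any other match."""
    n, q = name.lower(), query.lower()
    if n == q:
        rank = 0
    elif n.startswith(q):
        rank = 1
    else:
        rank = 2
    # Ties within a rank are broken alphabetically.
    return (rank, n)


names = ["Kashiva", "Shivam", "Shiv", "Shivendra"]
ordered = sorted(names, key=lambda name: relevance_key(name, "shiv"))
print(ordered)
```

With a queryset, the same key could presumably be applied as sorted(Result, key=lambda u: relevance_key(u.name, query_from_searchbox)), which evaluates the queryset and sorts in memory, so it is only reasonable when the filtered result is small.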

How to build query form to request AWS CloudSearch?

I have a SearchDomain on AWS CloudSearch. I know all the defined facets names.
I would like to build a web query form to use it, but I want to add my categories values (facets) on the side, like it is done on Amazon webstore
The only way I have to get facets values is to make a query (params query) and in the answer will contain facets linked to my query results.
Is there a way to fetch all the facet.FIELD possible values to build the query form ?
If not (as read here), how to design a form using facets ?
You could also use the matchall keyword in a structured query. Also, since you don't need the results you can pass size=0 so you only get the facets which will reduce latency.
/2013-01-01/search?q=matchall&q.parser=structured&size=0&facet.field={sort:'count',size:100}
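A request like the one above could be assembled as follows. This is a sketch using only the standard library; the facet_only_query helper and the field names category and brand are hypothetical, and it assumes one facet.FIELD parameter is sent per facet-enabled field.

```python
from urllib.parse import parse_qs, urlencode


def facet_only_query(facet_fields, max_values=100):
    """Build a search path that matches everything and returns only facets."""
    params = {"q": "matchall", "q.parser": "structured", "size": 0}
    for field in facet_fields:
        # One facet.FIELD parameter per field, sorted by count.
        params[f"facet.{field}"] = f"{{sort:'count',size:{max_values}}}"
    return "/2013-01-01/search?" + urlencode(params)


url = facet_only_query(["category", "brand"])
print(url)

# Decoding the query string shows the parameters as they will be received.
decoded = parse_qs(url.split("?", 1)[1])
print(decoded["facet.category"])
```

The returned facet buckets can then be used to populate the category lists beside the form.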

Overcoming Exclude List Limit Size

I'm trying to make a query using Django's Exclude() and passing to it a list, as in:
(...).exclude(id__in=list(top_vip_deals_filter))
The problem is that, apparently, there is a limit (depending on your database) on the size of the list being passed.
Is this correct?
If so, How to overcome this?
If not, is there some explanation to the fact that queries silently fail when the list size is big?
Thanks
If the top_vip_deals_filter comes from the database, you can set an extra where in the query:
(...).extra(where=['model.id not in (select blah blah)'])
(put your lowercase model name instead of model.)
You can do better if the data model allows you to. If you can do it in SQL, you probably can do it in django.
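The idea of letting the database compute the exclusion set, instead of sending a long list of ids, can be illustrated with the standard sqlite3 module. This sketch deliberately uses raw SQL rather than Django so that it is self-contained; the deal and vip_deal tables are hypothetical stand-ins for the real models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE deal (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE vip_deal (deal_id INTEGER);
    INSERT INTO deal (id, title) VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO vip_deal (deal_id) VALUES (1), (3);
    """
)

# The excluded ids never leave the database, so no parameter-count
# limit on the IN list applies.
remaining = conn.execute(
    "SELECT id FROM deal "
    "WHERE id NOT IN (SELECT deal_id FROM vip_deal) "
    "ORDER BY id"
).fetchall()
print(remaining)
conn.close()
```

In Django, passing a queryset rather than a list to id__in (for example a values('id') queryset) generally produces this kind of subquery, which avoids the extra() call altogether.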

How to limit columns returned by Django query?

That seems simple enough, but all Django Queries seems to be 'SELECT *'
How do I build a query returning only a subset of fields ?
In Django 1.1 onwards, you can use defer('col1', 'col2') to exclude columns from the query, or only('col1', 'col2') to only get a specific set of columns. See the documentation.
values does something slightly different - it only gets the columns you specify, but it returns a list of dictionaries rather than a set of model instances.
Append a .values("column1", "column2", ...) to your query
The accepted answer advises defer and only, which the docs discourage in most cases:
only use defer() when you cannot, at queryset load time, determine if you will need the extra fields or not. If you are frequently loading and using a particular subset of your data, the best choice you can make is to normalize your models and put the non-loaded data into a separate model (and database table). If the columns must stay in the one table for some reason, create a model with Meta.managed = False (see the managed attribute documentation) containing just the fields you normally need to load and use that where you might otherwise call defer(). This makes your code more explicit to the reader, is slightly faster and consumes a little less memory in the Python process.