How to correctly apply MethodResponse to "filter" AWS api gateway response - amazon-web-services

When I apply a MethodResponse model I see no difference in the final response. My goal is to apply a schema with minItems and maxItems for an array property.
Example response from the Lambda method:
{
  "_id": "5d5110f52e8b560af82dec69",
  "index": 0,
  "friends": [
    {
      "id": 0,
      "name": "Mcconnell Pugh"
    },
    {
      "id": 1,
      "name": "Peggy Caldwell"
    },
    {
      "id": 2,
      "name": "Jocelyn Mccarthy"
    }
  ]
}
The schema I tried to apply in the MethodResponse:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Empty Schema",
  "type": "object",
  "properties": {
    "friends": {
      "type": "array",
      "minItems": 1,
      "maxItems": 2,
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string"
          },
          "id": {
            "type": "integer"
          }
        }
      }
    },
    "index": {
      "type": "string"
    }
  }
}
I would expect to see only two "friends" in the final response, not all of them.

After long research and a lot of AWS documentation I have found that:
It supports JSON Schema draft 4, though not all features are supported -> related docs
Method Response does not apply validation at all. In my understanding, it is only useful if you want to export your API to Swagger with well-described specs -> related docs, the last paragraph is important
So the final answer to my question is: you simply cannot use Method Response as a filter; that is not its purpose.
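If the goal is to actually trim the payload, that filtering has to happen in the Integration Response mapping template instead. Here is a minimal sketch of such a template, assuming the Lambda returns the example payload above verbatim (untested; adjust property names to your integration):
#set($inputRoot = $input.path('$'))
{
  "_id": "$inputRoot._id",
  "index": $inputRoot.index,
  "friends": [
#foreach($friend in $inputRoot.friends)
#if($foreach.count <= 2)
#if($foreach.count > 1),#end
    {
      "id": $friend.id,
      "name": "$friend.name"
    }
#end
#end
  ]
}
$foreach.count is 1-based, so this emits the first two friends and silently drops the rest.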

Related

Why doesn't the Keyword analyzer applied to a Text field return results when the pattern contains a dash in a Regexp search query?

I have created a small example to demonstrate the specific issue I'm having. Briefly, when I create a multi-field mapping using a field type of Text and the Keyword analyzer, no documents are returned from an Elasticsearch Regexp search query that contains punctuation. I use a dash in the following example to demonstrate the problem.
I’m using Elasticsearch 7.10.2. The index I’m targeting is already populated with millions of documents. The field of type Text where I need to run some regular expressions uses the Standard (default) analyzer. I understand that, because the field gets tokenized by the Standard analyzer, the following request:
POST _analyze
{
  "analyzer" : "default",
  "text" : "The number is: 123-4576891-73.\n\n"
}
will yield three words: "the", "number", "is", and three groups of numbers: "123", "4576891", "73". It's obvious that a regular expression that relies on punctuation, like this one that contains two literal dashes:
"(.*[^a-z0-9_])?[0-9]{3}-[0-9]{7}-[0-9]{2}([^a-z0-9_].*)?"
will not return a result. Note, for those not familiar with this, regex shortcuts do not work for Lucene-based Elasticsearch requests (at least not yet). Here's a reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/regexp-syntax.html. Also, the word-boundary expressions I use in my examples, (.*[^a-z0-9_])? and ([^a-z0-9_].*)?, are from this post: Word boundary in Lucene regex.
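For reference, here is roughly what that _analyze request returns (abbreviated; offsets and token types shown as I'd expect from the standard analyzer):
{
  "tokens" : [
    { "token" : "the",     "start_offset" : 0,  "end_offset" : 3,  "type" : "<ALPHANUM>", "position" : 0 },
    { "token" : "number",  "start_offset" : 4,  "end_offset" : 10, "type" : "<ALPHANUM>", "position" : 1 },
    { "token" : "is",      "start_offset" : 11, "end_offset" : 13, "type" : "<ALPHANUM>", "position" : 2 },
    { "token" : "123",     "start_offset" : 15, "end_offset" : 18, "type" : "<NUM>", "position" : 3 },
    { "token" : "4576891", "start_offset" : 19, "end_offset" : 26, "type" : "<NUM>", "position" : 4 },
    { "token" : "73",      "start_offset" : 27, "end_offset" : 29, "type" : "<NUM>", "position" : 5 }
  ]
}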
To see this for yourself with an example, create and populate an index like so:
PUT /index-01
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "text": { "type": "text" }
    }
  }
}
POST index-01/_doc/
{
  "text": "The number is: 123-4576891-73.\n\n"
}
The following Regexp search query will return nothing because of the tokenization issue I described earlier:
POST index-01/_search
{
  "size": 1,
  "query": {
    "regexp": {
      "text": {
        "value": "(.*[^a-z0-9_])?[0-9]{3}-[0-9]{7}-[0-9]{2}([^a-z0-9_].*)?",
        "flags": "ALL",
        "case_insensitive": true,
        "max_determinized_states": 100000
      }
    }
  },
  "_source": false,
  "highlight": {
    "fields": {
      "text": {}
    }
  }
}
Most posts suggest that a quick fix would be to target the keyword multi-field instead of the text field. The keyword multi-field gets created automatically, as this shows:
GET index-01/_mapping/field/text
response:
{
  "index-01" : {
    "mappings" : {
      "text" : {
        "full_name" : "text",
        "mapping" : {
          "text" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}
Targeting the keyword field, I do get results back for the following Regexp search query:
POST index-01/_search
{
  "size": 1,
  "query": {
    "regexp": {
      "text.keyword": {
        "value": "(.*[^a-z0-9_])?[0-9]{3}-[0-9]{7}-[0-9]{2}([^a-z0-9_].*)?",
        "flags": "ALL",
        "case_insensitive": true,
        "max_determinized_states": 100000
      }
    }
  },
  "_source": false,
  "highlight": {
    "fields": {
      "text.keyword": {}
    }
  }
}
Here's the hit-highlighted part of the result:
...
"highlight" : {
  "text.keyword" : [
    "<em>This is my number 123-4576891-73. Thanks\n\n</em>"
  ]
}
...
Because some of the documents contain a large amount of text, I adjusted the text.keyword field size with the ignore_above parameter:
PUT /index-01/_mapping
{
  "properties": {
    "text": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 32766
        }
      }
    }
  }
}
However, this will skip some documents, since the targeted index contains text fields larger than this upper bound for the keyword field type. Also, according to the Elasticsearch documentation here: https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html, this field type is really designed for structured data, constant values, and wildcard queries.
Following that guidance, I assigned the Keyword analyzer to a new field type Text (text.raw) by making this update to the mapping:
PUT /index-01/_mapping
{
  "properties": {
    "text": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 32766
        },
        "raw": {
          "type": "text",
          "analyzer": "keyword",
          "index": true
        }
      }
    }
  }
}
Now, you can see the additional mapping text.raw with this request:
GET index-01/_mapping/field/text
response:
{
  "index-01" : {
    "mappings" : {
      "text" : {
        "full_name" : "text",
        "mapping" : {
          "text" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 32766
              },
              "raw" : {
                "type" : "text",
                "analyzer" : "keyword"
              }
            }
          }
        }
      }
    }
  }
}
Next, I verified that the data was, in fact, mapped to the multi-fields:
POST index-01/_search
{
  "query": {
    "match_all": {}
  },
  "fields": ["text", "text.keyword", "text.raw"]
}
response:
...
"hits" : [
  {
    "_index" : "index-01",
    "_type" : "_doc",
    "_id" : "2R-OgncBn-TNB4PjXYAh",
    "_score" : 1.0,
    "_source" : {
      "text" : "The number is: 123-4576891-73.\n\n"
    },
    "fields" : {
      "text" : [
        "The number is: 123-4576891-73.\n\n"
      ],
      "text.keyword" : [
        "The number is: 123-4576891-73.\n\n"
      ],
      "text.raw" : [
        "The number is: 123-4576891-73.\n\n"
      ]
    }
  }
]
...
I also verified that the Keyword analyzer applied to the text.raw field produces a single token, as the following request shows:
POST _analyze
{
  "analyzer" : "keyword",
  "text" : "The number is: 123-4576891-73.\n\n"
}
response:
{
  "tokens" : [
    {
      "token" : "The number is: 123-4576891-73.\n\n",
      "start_offset" : 0,
      "end_offset" : 32,
      "type" : "word",
      "position" : 0
    }
  ]
}
However, the exact same Regexp search query targeting the text.raw field returns nothing:
POST index-01/_search
{
  "size": 1,
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "text.raw": {
              "value": "(.*[^a-z0-9_])?[0-9]{3}-[0-9]{7}-[0-9]{2}([^a-z0-9_].*)?",
              "flags": "ALL",
              "case_insensitive": true,
              "max_determinized_states": 100000
            }
          }
        }
      ]
    }
  },
  "_source": false,
  "highlight" : {
    "fields" : {
      "text.raw": {}
    }
  }
}
Please let me know if you know why I'm not getting back a result using the field type Text with the Keyword analyzer.

AWS Elasticsearch: search should match all combinations of the given query

I'm working on AWS Elasticsearch. I've come across a situation in my project where, in my reports, I have to search for keywords like "corona virus".
But results should come back for documents containing "Corona virus", "corona", "virus", and "coronavirus".
Please guide me on how I should build my query DSL.
Note: I'm working in PHP.
Appreciate your help.
//Amit
You need to use the shingle token filter:
A token filter of type shingle that constructs shingles (token n-grams) from a token stream. In other words, it creates combinations of tokens as a single token. For example, the sentence "please divide this sentence into shingles" might be tokenized into the shingles "please divide", "divide this", "this sentence", "sentence into", and "into shingles".
Mapping
PUT index91
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "shingle_filter"
          ]
        }
      },
      "filter": {
        "shingle_filter": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 3,
          "output_unigrams": true,
          "token_separator": ""
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
Data:
POST index91/_doc
{
  "title": "corona virus"
}
Query:
GET index91/_search
{
  "query": {
    "match": {
      "title": "coronavirus"
    }
  }
}
Result:
"hits" : [
{
"_index" : "index91",
"_type" : "_doc",
"_id" : "gNmUZHEBrJsHVOidaoU_",
"_score" : 0.9438393,
"_source" : {
"title" : "corona virus"
}
}
It will also work for "corona", "corona virus", and "virus".
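You can verify the analysis yourself (a quick sketch, using the my_analyzer defined in the mapping above):
GET index91/_analyze
{
  "analyzer": "my_analyzer",
  "text": "corona virus"
}
This should return three tokens: "corona", "coronavirus" (the two-word shingle, joined with no space because token_separator is ""), and "virus", which is why all the search variants can match.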

How do I get only the element values that match in the list in Elasticsearch?

Hi, there.
I want to create an ES query that only retrieves certain elements that match in the list.
Here is my ES index schema.
"test-es-2018":{
"aliases": {},
"mappings": {
"test-1": {
"properties": {
"categoryName": {
"type": "keyword",
"index": false
},
"genDate": {
"type": "date"
},
"docList": {
"properties": {
"rank": {
"type": "integer",
"index": false
},
"doc-info": {
"properties": {
"docId": {
"type": "keyword"
},
"docName": {
"type": "keyword",
"index": false
},
}
}
}
},
"categoryId": {
"type": "keyword"
},
}
}
}
}
There are documents listed in the category. Documents in the list have their own information.
*search query in Kibana.
source": {
"categoryName" : "food" ,
"genDate" : 1577981646638,
"docList" [
{
"rank": 2,
"doc-info": {...}
},
{
"rank": 1,
"doc-info": {...}
},
{
"rank": 5,
"doc-info": {...}
},
],
"categoryId": "201"
}
First, I want to get only the element values that match in the list.
I would like to see only the elements with rank 1 in the list. However, if I query using match as below, the result is the same as the *search query in Kibana above.
*match query in Kibana.
GET test-es-2018/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "docList.rank": 1 } }
      ]
    }
  }
}
In my opinion, it returns the entire list because the document contains an element with rank 1.
What I want is:
source": {
"categoryName" : "food" ,
"genDate" : 1577981646638,
"docList" [
{
"rank": 1,
"doc-info": {...}
},
],
"categoryId": "201"
}
Is this possible?
Second, I want to sort the docList by rank. I tried sorting by creating a query like the following, but it was not sorted.
*sort query in Kibana.
GET test-es-2018/_search
{
  "query": {
    "bool": {...}
  },
  "sort": [
    {
      "docList.rank": {
        "order": "asc"
      }
    }
  ]
}
What I want is:
source": {
"categoryName" : "food" ,
"genDate" : 1577981646638,
"docList" [
{
"rank": 1,
"doc-info": {...}
},
{
"rank": 2,
"doc-info": {...}
},
{
"rank": 5,
"doc-info": {...}
},
],
"categoryId": "201"
}
I do not know how to access the list this way. Is there a good approach for both of these issues?
In general you could use source filtering to retrieve only part of the document, but this way it's not possible to exclude some fields based on their values.
As far as I know, Elasticsearch doesn't support changing the order of field values in the _source. The desired result can partly be achieved by using nested fields along with an inner_hits -> sort query expression. This way, the sorted sub-hits are returned in the inner_hits section of the response.
P.S. Generally, when working with Elasticsearch you should treat the indexed document as the smallest indivisible search unit.
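A minimal sketch of that approach, assuming a 7.x-style typeless mapping, that docList is remapped as a nested field, and that rank is indexed (your current mapping sets "index": false on rank, which would prevent querying it); this requires reindexing into a new index, here hypothetically named test-es-2018-v2:
PUT test-es-2018-v2
{
  "mappings": {
    "properties": {
      "categoryName": { "type": "keyword", "index": false },
      "genDate": { "type": "date" },
      "categoryId": { "type": "keyword" },
      "docList": {
        "type": "nested",
        "properties": {
          "rank": { "type": "integer" }
        }
      }
    }
  }
}
GET test-es-2018-v2/_search
{
  "_source": ["categoryName", "categoryId"],
  "query": {
    "nested": {
      "path": "docList",
      "query": { "match": { "docList.rank": 1 } },
      "inner_hits": {
        "sort": [{ "docList.rank": { "order": "asc" } }]
      }
    }
  }
}
The inner_hits section then contains only the matching rank-1 element. For your second question, replacing the inner match with match_all should return every docList element, sorted by rank, in inner_hits.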

how to handle nested lists in AWS APIG Mapping Template in VTL

Here's my Model schema:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "QuestionsModel",
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "section_name": { "type": "string" },
      "options": {
        "type": "array",
        "items": {
          "type": "array",
          "items": {
            "type": "string"
          }
        }
      }
    }
  }
}
Here's the Mapping template:
#set($inputRoot = $input.path('$'))
[
#foreach($question in $inputRoot) {
  "section_name" : "$question.section_name.S",
  "options" : [
#foreach($items in $question.options.L) {
    [
#foreach($item in $items.L) {
      "$item.S"
    }#if($foreach.hasNext),#end
#end
    ]
  }#if($foreach.hasNext),#end
#end
  ]
}#if($foreach.hasNext),#end
#end
]
Although this template executes, it results in "options" being an empty array.
Without "options" specified, my iOS app receives valid JSON. But when I try various syntaxes for "options", I either get invalid JSON or an "Internal Service Error", and CloudWatch isn't much better, offering only "Unable to transform response".
The options value is populated with this content: {L=[{"L":[{"S":"1"},{"S":"Dr"}]},{"L":[{"S":"2"},{"S":"Mr"}]},{"L":[{"S":"3"},{"S":"Ms"}]},{"L":[{"S":"4"},{"S":"Mrs"}]},{"L":[{"S":"5"},{"S":"Prof."}]}]} which is provided by a Lambda function.
I can only conclude, at this point, that API Gateway VTL doesn't support nested arrays.
The AWS iOS SDK for modelling doesn't support arrays of arrays.
You have to define a dictionary in between any nested arrays.
So instead of array/object/array/array, you slip in an extra "awshack" object: array/object/array/awshack-object/array.
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "QuestionsModel",
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "section_name": { "type": "string" },
      "options": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "awshack": {
              "type": "array",
              "items": { "type": "string" }
            }
          }
        }
      }
    }
  }
}
In the mapping template the "awshack" is slipped in outside the innermost loop.
#foreach($items in $question.options.L)
{"awshack" :
  [#foreach($item in $items.L)
    "$item.S"#if($foreach.hasNext),#end
  #end
]}#if($foreach.hasNext),#end
#end
Amazon confirms this limitation.
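For reference, with the options content shown in the question, the template above should emit JSON shaped roughly like this (a sketch, not captured output; the section_name value is elided):
[
  {
    "section_name": "...",
    "options": [
      {"awshack": ["1", "Dr"]},
      {"awshack": ["2", "Mr"]},
      {"awshack": ["3", "Ms"]},
      {"awshack": ["4", "Mrs"]},
      {"awshack": ["5", "Prof."]}
    ]
  }
]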

How do I define Elasticsearch dynamic templates?

I'm trying to define dynamic templates in Elasticsearch to automatically set analyzers for currently undefined properties for translations.
E.g. the following does exactly what I want, which is to set lang.en.title to use the english analyzer:
PUT /cl
{
  "mappings" : {
    "titles" : {
      "properties" : {
        "id" : {
          "type" : "integer",
          "index" : "not_analyzed"
        },
        "lang" : {
          "type" : "object",
          "properties" : {
            "en" : {
              "type" : "object",
              "properties" : {
                "title" : {
                  "type" : "string",
                  "index" : "analyzed",
                  "analyzer" : "english"
                }
              }
            }
          }
        }
      }
    }
  }
}
This stems lang.en.title as expected, e.g.
GET /cl/_analyze?field=lang.en.title&text=knocked
{
  "tokens": [
    {
      "token": "knock",
      "start_offset": 0,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
But what I'm trying to do is set all future string properties of lang.en to use the english analyzer via a dynamic template, which I can't seem to get working...
PUT /cl
{
  "mappings" : {
    "titles" : {
      "dynamic_templates" : [{
        "lang_en" : {
          "path_match" : "lang.en.*",
          "mapping" : {
            "type" : "string",
            "index" : "analyzed",
            "analyzer" : "english"
          }
        }
      }],
      "properties" : {
        "id" : {
          "type" : "integer",
          "index" : "not_analyzed"
        },
        "lang" : {
          "type" : "object"
        }
      }
    }
  }
}
The english analyzer isn't being applied, as lang.en.title isn't stemmed as desired:
GET /cl/_analyze?field=lang.en.title&text=knocked
{
  "tokens": [
    {
      "token": "knocked",
      "start_offset": 0,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
What am I missing? :)
Your dynamic template is defined correctly. The issue is that you will need to index a document with the lang.en.title field in it before the dynamic template will apply the appropriate mapping. I ran the same dynamic mapping that you have defined above in your question locally and got the same result as you.
However, then I added a single document to the index.
POST /cl/titles/1
{
  "lang.en.title": "Knocked out"
}
After adding the document, I ran the analyzer again and I got the expected output:
GET /cl/_analyze?field=lang.en.title&text=knocked
{
  "tokens": [
    {
      "token": "knock",
      "start_offset": 0,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
The index needs to have a document inserted so that it can execute the defined mapping template for the inserted fields. Once that field exists in the index and has the dynamic mapping applied, _analyze API calls will execute as expected.
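To double-check that the dynamic template fired, you can also inspect the mapping that was generated for the new field (the response shape sketched here is roughly what I'd expect from this Elasticsearch version; abbreviated):
GET /cl/_mapping/field/lang.en.title
{
  "cl" : {
    "mappings" : {
      "titles" : {
        "lang.en.title" : {
          "full_name" : "lang.en.title",
          "mapping" : {
            "title" : {
              "type" : "string",
              "analyzer" : "english"
            }
          }
        }
      }
    }
  }
}
Seeing "analyzer": "english" on the generated field confirms the dynamic template was applied.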