AWS API Gateway only allows the first element of the form data and ignores the rest - amazon-web-services

I have been trying to push data into AWS SQS through AWS API Gateway; the data I send is application/x-www-form-urlencoded form data.
It looks something like this:
fruits[]: apple
fruits[]: mango
fruits[]: banana
season: summer
When I poll the messages from AWS SQS, I see that only fruits[]=apple has been stored and all the others are ignored.
This is my current mapping template for pushing into SQS:
Action=SendMessage&MessageBody=$input.body
It looks as if $input.body is made up of multiple key=value pairs, but if that is the case it seems almost impossible to capture arbitrary incoming data.
I am new to AWS API Gateway. Thanks in advance. :D

After a lot of research, I was able to decipher this mystery.
The value of $input.body is:
fruits[]=apple&fruits[]=mango&fruits[]=banana&season=summer
Only MessageBody is pushed to SQS, so with my template the resulting query string being formed was:
Action=SendMessage&MessageBody=fruits[]=apple&fruits[]=mango&fruits[]=banana&season=summer
Only fruits[]=apple falls under MessageBody; the other pairs become separate query parameters and are therefore ignored.
I just had to tweak the template to:
Action=SendMessage&MessageBody=$util.urlEncode($input.body)
Now the body no longer contributes any unescaped & or = characters, so everything falls under MessageBody.
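For illustration, with the sample body above the encoded query string should end up looking roughly like this (brackets, = and & percent-encoded):
Action=SendMessage&MessageBody=fruits%5B%5D%3Dapple%26fruits%5B%5D%3Dmango%26fruits%5B%5D%3Dbanana%26season%3Dsummer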
Edits are welcome.

Try this:
Request:
POST apigateway/stage/resource?query=test
{
    "season": "summer",
    "list": ["apple", "mango", "banana"]
}
Mapping:
#set($inputRoot = $input.path('$'))
{
    "query": "$input.params('query')",
    "id": "$inputRoot.season",
    "list": $input.json('$.list')
}
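With the sample request above, this mapping should produce something close to:
{
    "query": "test",
    "id": "summer",
    "list": ["apple","mango","banana"]
}
(Using $input.json('$.list') keeps the list as a proper JSON string; rendering $inputRoot.list directly would emit the unquoted [apple, mango, banana] form, which is not valid JSON.)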
https://aws.amazon.com/blogs/compute/using-api-gateway-mapping-templates-to-handle-changes-in-your-back-end-apis/
https://docs.aws.amazon.com/apigateway/latest/developerguide/example-photos.html

Related

Create a cloud scheduler job in CLI with nested json request body

I am running into trouble trying to schedule a Workflow using Google Cloud Scheduler through the Google CLI.
In particular my workflow requires the following request body:
{
    "arg1": "some_string",
    "arg2": "some_other_string",
    "arg3": [
        {
            "foo": "foo1",
            "bar": "bar1"
        },
        {
            "foo": "foo2",
            "bar": "bar2"
        }
    ]
}
In a different workflow, with a request body consisting only of arg1 and arg2, I was able to schedule a cloud function using the double-escaped JSON string format:
gcloud scheduler jobs create http <NAME> --schedule=<CRON> --uri=<URI> --message-body="{\"argument\": \"{\\\"arg1\\\":\\\"some_string\\\",\\\"arg2\\\":\\\"some_other_string\\\"}\"}" --time-zone="UTC"
With the above request body I am unclear how to do this. I tried setting the message body as:
"{\"argument\": \"{\\\"arg1\\\":\\\"some_string\\\",\\\"arg2\\\":\\\"some_other_string\\\",\\\"arg3\\\":\\\"[{\\\\\"foo\\\\\":\\\\\"foo1\\\\\",\\\\\"bar\\\\\":\\\\\"bar1\\\\\"}]\\\"}\"}"
But it didn't seem to like this and threw an "INVALID ARGUMENT" status. I've also tried a few other variations, such as without quotes around the list brackets, but haven't had any success.
Apologies for how ugly these strings are. Is anyone aware of how to format them correctly or, better yet, of a simpler way of entering the request body in the command?
Thanks in advance.
Edit: I have tried using the --message-body-from-file argument as mentioned in the comments by @JohnHanley. I found it still required escaped quotes to work in my simple case.
body.json
{"argument": "{\"arg1\":\"some_string\",\"arg2\":\"some_other_string\"}"}
However, when I tried the nested case with no quotes around the list, it did work!
body.json
{"argument": "{\"arg1\":\"some_string\",\"arg2\":\"some_other_string\", \"arg3\": [{\"foo\": \"foo1\", \"bar\": \"bar1\"},{\"foo\":\"foo2\", \"bar\": \"bar2\"}]"}
Solved by incorporating the comments by @JohnHanley and unquoting the repeated field.
Command:
gcloud scheduler jobs create http <NAME> --schedule=<CRON> --uri=<URI> --message-body-from-file="body.json" --time-zone="UTC"
body.json
{"argument": "{\"arg1\":\"some_string\",\"arg2\":\"some_other_string\", \"arg3\": [{\"foo\": \"foo1\", \"bar\": \"bar1\"},{\"foo\":\"foo2\", \"bar\": \"bar2\"}]"}

AWS Kendra PreHook Lambdas for Data Enrichment

I am working on a POC using Kendra and Salesforce. The connector allows me to connect to my Salesforce Org and index knowledge articles. I have been able to set this up and it is currently working as expected.
There are a few custom fields and data points I want to bring over to help enrich the data even more. One of these is an additional answer/body that will contain key information for searching.
This field in my data source is rich text containing HTML and is often larger than 2048 characters, a limit that seems to be imposed on String data fields within Kendra.
I came across two hooks that are built in for Pre and Post data enrichment. My thought here is that I can use the pre hook to strip HTML tags and truncate the field before it gets stored in the index.
Hook Reference: https://docs.aws.amazon.com/kendra/latest/dg/API_CustomDocumentEnrichmentConfiguration.html
Current Setup:
I have added a new field to the index called sf_answer_preview. I then mapped this field in the data source to the rich text field in the Salesforce org.
If I run this as is, it will index about 200 of the 1,000 articles and give an error that the remaining articles exceed the 2048-character limit in that field, which is why I am trying to set up the enrichment.
I set up the above enrichment on my data source. I specified a lambda to use for pre-extraction, and no additional filtering, so it runs on every article. I am not 100% certain what the S3 bucket is for since I am using a data source, but it appears to be needed, so I have added that as well.
For my lambda, I created the following:
exports.handler = async (event) => {
    // Debug
    console.log(JSON.stringify(event))

    // Vars
    const s3Bucket = event.s3Bucket;
    const s3ObjectKey = event.s3ObjectKey;
    const meta = event.metadata;

    // Answer
    const answer = meta.attributes.find(o => o.name === 'sf_answer_preview');

    // Remove HTML Tags
    const removeTags = (str) => {
        if ((str === null) || (str === ''))
            return false;
        else
            str = str.toString();
        return str.replace(/(<([^>]+)>)/ig, '');
    }

    // Truncate
    const truncate = (input) => input.length > 2000 ? `${input.substring(0, 2000)}...` : input;

    let result = truncate(removeTags(answer.value.stringValue));

    // Response
    const response = {
        "version": "v0",
        "s3ObjectKey": s3ObjectKey,
        "metadataUpdates": [
            {"name": "sf_answer_preview", "value": {"stringValue": result}}
        ]
    }

    // Debug
    console.log(response)

    // Response
    return response
};
Based on the contract for the lambda described here, it appears pretty straightforward. I access the event, find the field in the data called sf_answer_preview (the rich text field from Salesforce), and I strip and truncate the value to 2,000 characters.
For the response, I am telling it to update that field to the new formatted answer so that it complies with the field limits.
When I log the data in the lambda, the pre-extraction event details are as follows:
{
    "s3Bucket": "kendrasfdev",
    "s3ObjectKey": "pre-extraction/********/22736e62-c65e-4334-af60-8c925ef62034/https://*********.my.salesforce.com/ka1d0000000wkgVAAQ",
    "metadata": {
        "attributes": [
            {
                "name": "_document_title",
                "value": {
                    "stringValue": "What majors are under the Exploratory track of Health and Life Sciences?"
                }
            },
            {
                "name": "sf_answer_preview",
                "value": {
                    "stringValue": "A complete list of majors affiliated with the Exploratory Health and Life Sciences track is available online. This track allows you to explore a variety of majors related to the health and life science professions. For more information, please visit the Exploratory program description. "
                }
            },
            {
                "name": "_data_source_sync_job_execution_id",
                "value": {
                    "stringValue": "0fbfb959-7206-4151-a2b7-fce761a46241"
                }
            }
        ]
    }
}
The Problem:
When this runs, I am still getting the same field limit error that the content exceeds the character limit. When I run the lambda on the raw data, it strips and truncates it as expected. I am thinking that, for some reason, the response from the lambda isn't setting the field value to the new content correctly, and the index is still trying to use the data directly from Salesforce, thus throwing the error.
Has anyone set up lambdas for Kendra before who might know what I am doing wrong? Being able to do things like strip PII before it gets indexed seems pretty common, so I must be slightly off in my setup somewhere.
Any thoughts?
Since you are still passing the rich text as a metadata field of the document, the character limit still applies: the document fails at the validation step of the API call and never reaches the enrichment step. A workaround is to somehow append those rich text fields to the body of the document so that your lambda can access them there. But if those fields are auto-generated for your documents from your data sources, that might not be easy.

Can I create Slack subscriptions to an AWS SNS topic?

I'm trying to create an SNS topic in AWS and subscribe a lambda function to it that will send notifications to Slack apps/users.
I did read this article -
https://aws.amazon.com/premiumsupport/knowledge-center/sns-lambda-webhooks-chime-slack-teams/
that describes how to do it using this lambda code:
#!/usr/bin/python3.6
import urllib3
import json

http = urllib3.PoolManager()

def lambda_handler(event, context):
    url = "https://hooks.slack.com/services/xxxxxxx"
    msg = {
        "channel": "#CHANNEL_NAME",
        "username": "WEBHOOK_USERNAME",
        "text": event['Records'][0]['Sns']['Message'],
        "icon_emoji": ""
    }
    encoded_msg = json.dumps(msg).encode('utf-8')
    resp = http.request('POST', url, body=encoded_msg)
    print({
        "message": event['Records'][0]['Sns']['Message'],
        "status_code": resp.status,
        "response": resp.data
    })
The problem with that implementation is that I have to create a lambda function for every user.
I want to subscribe multiple Slack apps/users to one SNS topic.
Is there a way of doing that without creating a lambda function for each one?
You really DON'T need Lambda. Just SNS and Slack are enough.
I found a way to integrate AWS SNS with Slack WITHOUT AWS Lambda or AWS Chatbot. With this approach you can confirm the subscription easily.
Follow the video, which shows all the steps clearly:
https://www.youtube.com/watch?v=CszzQcPAqNM
Steps to follow:
1. Create a Slack channel or use an existing channel.
2. Create a workflow, selecting Webhook.
3. Create a variable named "SubscribeURL". The name is very important.
4. Add the above variable to the message body of the workflow, then publish the workflow and get the URL.
5. Add the above URL as a subscription of the SNS topic. You will see the subscription URL in the Slack channel.
6. Follow the URL and complete the subscription.
7. Come back to the workflow and change the "SubscribeURL" variable to "Message".
8. Then publish a message to SNS. You will see the message in the Slack channel.
Hi, I would say you should go for a for loop and make a list of all the users. Either state them manually in the lambda or get them with an API call from Slack, e.g. this one: https://api.slack.com/methods/users.list
#!/usr/bin/python3.6
import urllib3
import json

http = urllib3.PoolManager()

def lambda_handler(event, context):
    userlist = ["name1", "name2"]
    for user in userlist:
        url = "https://hooks.slack.com/services/xxxxxxx"
        msg = {
            "channel": "#" + user,  # not sure if the hash has to be here
            "username": "WEBHOOK_USERNAME",
            "text": event['Records'][0]['Sns']['Message'],
            "icon_emoji": ""
        }
        encoded_msg = json.dumps(msg).encode('utf-8')
        resp = http.request('POST', url, body=encoded_msg)
        print({
            "message": event['Records'][0]['Sns']['Message'],
            "status_code": resp.status,
            "response": resp.data
        })
Another solution is to set up email addresses for the Slack users; see the link:
https://slack.com/help/articles/206819278-Send-emails-to-Slack
Then you can just add the emails as subscribers to the SNS topic. You can filter the messages that each receiver gets with a subscription filter policy, for example as sketched below.
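Here is a minimal boto3 sketch of subscribing one such email address with a filter policy attached; the topic ARN, email address, and attribute name are placeholders:

import json
import boto3

sns = boto3.client("sns")

# Subscribe a Slack channel's email address to the topic and attach a
# filter policy so it only receives messages with a matching attribute.
sns.subscribe(
    TopicArn="arn:aws:sns:us-east-1:123456789012:my-topic",  # placeholder
    Protocol="email",
    Endpoint="my-channel-aaaa1111@example.slack.com",        # placeholder
    Attributes={
        "FilterPolicy": json.dumps({"team": ["backend"]})    # placeholder attribute
    },
)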

Regex filtering of messages in SNS

Is there a way to filter messages based on Regex or substring in AWS SNS?
AWS Documentation for filtering messages mentions three types of filtering for strings:
Exact matching (whitelisting)
Anything-but matching (blacklisting)
Prefix matching
I want to filter out messages based on substrings in the messages. For example,
I have an S3 event that sends a message to SNS when a new object is added to S3; the contents of the message are as below:
{
    "Records": [
        {
            "s3": {
                "bucket": {
                    "name": "images-bucket"
                },
                "object": {
                    "key": "some-key/more-key/filteringText/additionaldata.png"
                }
            }
        }
    ]
}
I want to keep a message only if filteringText is present in the key field.
Note: the entire message is sent as text by the S3 notification service, so Records is not a JSON object but a string.
From what I've seen in the documentation, you can't do regex matches or substrings, but you can match prefixes and create your own attributes in the MessageAttributes field.
To do this, I send the S3 event to a simple Lambda that adds MessageAttributes and then sends to SNS.
In effect, S3 -> Lambda -> SNS -> other consumers (with filtering).
The Lambda can do something like this (where you'll have to programmatically decide when to add the attribute):
let messageAttributes = {
    myfilterkey: {DataType: "String", StringValue: "filteringText"}
};
let params = {
    Message: JSON.stringify(payload),
    MessageAttributes: messageAttributes,
    MessageStructure: 'json',
    TargetArn: SNS_ARN
};
await sns.publish(params).promise();
await sns.publish(params).promise();
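The same idea sketched in Python with boto3, including the decision of when to attach the attribute; the topic ARN is a placeholder and the event shape is the S3 notification from the question:

import json
import boto3

sns = boto3.client("sns")
SNS_ARN = "arn:aws:sns:us-east-1:123456789012:my-topic"  # placeholder

def handler(event, context):
    # S3 notifications carry the object key under Records[0].s3.object.key.
    key = event["Records"][0]["s3"]["object"]["key"]

    # Attach the attribute only when the substring is present, so consumers
    # filtering on myfilterkey receive just those events.
    attributes = {}
    if "filteringText" in key:
        attributes["myfilterkey"] = {
            "DataType": "String",
            "StringValue": "filteringText",
        }

    sns.publish(
        TopicArn=SNS_ARN,
        Message=json.dumps(event),
        MessageAttributes=attributes,
    )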
Then in SNS you can filter:
{"myfilterkey": ["filtertext"]}
It seems a little convoluted to put the Lambda in there, but I like the idea of being able to plug and unplug consumers from SNS on the fly and use filtering to determine who gets what.

AWS API gateway response body template mapping (foreach)

I am trying to save data to S3 through Firehose, proxied by API Gateway. I have created an API Gateway endpoint that uses the AWS service integration type and the PutRecord action for Firehose. I have the mapping template as:
{
    "DeliveryStreamName": "test-stream",
    "Records": [
        #foreach($elem in $input.path('$.data'))
        {
            "Data": "$elem"
        }
        #if($foreach.hasNext),#end
        #end
    ]
}
Now when I test the endpoint with the JSON below:
{
    "data": [
        {"ticker_symbol":"DemoAPIGTWY","sector":"FINANCIAL","change":-0.42,"price":50.43},
        {"ticker_symbol":"DemoAPIGTWY","sector":"FINANCIAL","change":-0.42,"price":50.43}
    ]
}
the JSON gets modified and shows up as below after the transformation:
{ticker_symbol=DemoAPIGTWY, sector=FINANCIAL, change=-0.42, price=50.43}
The : is being converted to =, which is not valid JSON.
Not sure if something is wrong in the above mapping template.
The problem is that $input.path() returns a JSON object and not a stringified version of the JSON. You can take a look at the documentation here.
The Data property expects the value to be a string and not a JSON object. Long story short: currently there is no built-in function which can turn a JSON object back into its stringified version. This means you need to re-read the current element in the loop via $input.json(). This returns a JSON string representation of the element, which you can then add as Data.
Take a look at the answer here which illustrates this concept.
In your case, applying the concept described in the link above would result in a mapping like this:
{
    "DeliveryStreamName": "test-stream",
    "Records": [
        #foreach($elem in $input.path('$.data'))
        {
            #set($json = $input.json("$.data[$foreach.index]"))
            "Data": "$util.base64Encode($json)"
        }
        #if($foreach.hasNext),#end
        #end
    ]
}
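For a sanity check, here is a rough Python equivalent of what that template builds from the sample payload (stream name taken from the question):

import base64
import json

# Sample request body from the question.
body = {
    "data": [
        {"ticker_symbol": "DemoAPIGTWY", "sector": "FINANCIAL", "change": -0.42, "price": 50.43},
        {"ticker_symbol": "DemoAPIGTWY", "sector": "FINANCIAL", "change": -0.42, "price": 50.43},
    ]
}

# Mirror the mapping template: stringify each element, then base64-encode it,
# which is the format the Firehose API expects for each record's Data blob.
payload = {
    "DeliveryStreamName": "test-stream",
    "Records": [
        {"Data": base64.b64encode(json.dumps(elem).encode()).decode()}
        for elem in body["data"]
    ],
}

print(json.dumps(payload, indent=2))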
API Gateway treats the payload as text and not as JSON unless explicitly specified.
Kinesis also expects the data to be base64-encoded when proxying through API Gateway.
Try the following code and this should work (I am wondering why the for loop has been commented out in the mapping template).
Assuming you are not looping through the record set, the following solution should work for you:
{
    "DeliveryStreamName": "test-stream",
    "Record": {
        "Data": "$util.base64Encode($input.json('$.Data'))Cg=="
    }
}
Thanks & Regards,
Srivignesh KN