Passing HTTP POST from AWS API Gateway to Lambda - Mailgun

I am processing HTTP POSTs from a service that does not support JSON (Mailgun). It appears that if I create an AWS API Gateway POST method and pass the request to an AWS Lambda function, the data must be in JSON. Other than serializing the POST body to JSON (which I would prefer not to do), does anyone know if this is the case?

I found a solution here that works for me:
https://forums.aws.amazon.com/thread.jspa?messageID=673012&tstart=0#673012
The following is reproduced from the original post so this is a complete answer.
Step-by-step instructions are as follows:
Amazon API Gateway -> Click "Create API".
API name = "myTestAPI", Clone from API = Do not clone from existing API, Description = "Test"
Click "Create API".
Click "Create Resource".
Resource Name = "myTestInput", Resource Path = "mytestinput".
Click "Create Resource".
Click "Create Method".
Select "POST" or "GET" as required and click the tick.
Integration type = "Lambda function", pick the region as appropriate, and write the Lambda code needed to act on / store the form data.
Click "Save", click "Ok" to grant permission.
Click "Integration Request".
Click "Mapping Templates".
Click "Add mapping template".
Content-Type is "application/x-www-form-urlencoded" and click the tick.
Click "application/x-www-form-urlencoded".
Click the pencil icon next to "Input passthrough".
Select "Mapping template".
Paste the following into the Template box:
--
## convert HTML POST data or HTTP GET query string to JSON
## get the raw post data from the AWS built-in variable and give it a nicer name
#if ($context.httpMethod == "POST")
#set($rawAPIData = $input.path('$'))
#elseif ($context.httpMethod == "GET")
#set($rawAPIData = $input.params().querystring)
#set($rawAPIData = $rawAPIData.toString())
#set($rawAPIDataLength = $rawAPIData.length() - 1)
#set($rawAPIData = $rawAPIData.substring(1, $rawAPIDataLength))
#set($rawAPIData = $rawAPIData.replace(", ", "&"))
#else
#set($rawAPIData = "")
#end
## first we get the number of "&" in the string, this tells us if there is more than one key value pair
#set($countAmpersands = $rawAPIData.length() - $rawAPIData.replace("&", "").length())
## if there are no "&" at all then we have only one key value pair.
## we append an ampersand to the string so that we can tokenise it the same way as multiple kv pairs.
## the "empty" kv pair to the right of the ampersand will be ignored anyway.
#if ($countAmpersands == 0)
#set($rawAPIData = $rawAPIData + "&")
#end
## now we tokenise using the ampersand(s)
#set($tokenisedAmpersand = $rawAPIData.split("&"))
## we set up a variable to hold the valid key value pairs
#set($tokenisedEquals = [])
## now we set up a loop to find the valid key value pairs, which must contain only one "="
#foreach( $kvPair in $tokenisedAmpersand )
#set($countEquals = $kvPair.length() - $kvPair.replace("=", "").length())
#if ($countEquals == 1)
#set($kvTokenised = $kvPair.split("="))
#if ($kvTokenised[0].length() > 0)
## we found a valid key value pair. add it to the list.
#set($devNull = $tokenisedEquals.add($kvPair))
#end
#end
#end
## next we set up our loop inside the output structure "{" and "}"
{
#foreach( $kvPair in $tokenisedEquals )
## finally we output the JSON for this pair and append a comma if this isn't the last pair
#set($kvTokenised = $kvPair.split("="))
"$util.urlDecode($kvTokenised[0])" : #if($kvTokenised[1].length() > 0)"$util.urlDecode($kvTokenised[1])"#{else}""#end#if( $foreach.hasNext ),#end
#end
}
Click the tick next to the "Mapping template" dropdown.
Click "<- Method Execution".
Click "Deploy API".
Deployment stage = "New stage", Stage name = "production".
Click "Deploy".

Related

Dynamically Insert/Update Item in DynamoDB With Python Lambda using event['body']

I am working on a Lambda function that gets called from API Gateway and updates information in DynamoDB. I have half of this working really dynamically, and I'm a little stuck on updating. Here is what I'm working with:
a DynamoDB table with a partition key of guild_id
The dummy JSON I'm using:
{
    "guild_id": "126",
    "guild_name": "Posted Guild",
    "guild_premium": "true",
    "guild_prefix": "z!"
}
Finally the lambda code:
import json
import boto3

def lambda_handler(event, context):
    client = boto3.resource("dynamodb")
    table = client.Table("guildtable")
    itemData = json.loads(event['body'])
    guild = table.get_item(Key={'guild_id': itemData['guild_id']})
    # If guild exists, update
    if 'Item' in guild:
        table.update_item(Key=itemData)
        responseObject = {}
        responseObject['statusCode'] = 200
        responseObject['headers'] = {}
        responseObject['headers']['Content-Type'] = 'application/json'
        responseObject['body'] = json.dumps('Updated Guild!')
        return responseObject
    # New guild, insert it
    table.put_item(Item=itemData)
    responseObject = {}
    responseObject['statusCode'] = 200
    responseObject['headers'] = {}
    responseObject['headers']['Content-Type'] = 'application/json'
    responseObject['body'] = json.dumps('Inserted Guild!')
    return responseObject
The insert part is working wonderfully. How would I accomplish a similar approach with update_item? I want this to be as dynamic as possible so I can throw any JSON (within reason) at it and have it stored in the database, and I want the update path to handle fields that get added down the road.
I get the following error:
Lambda execution failed with status 200 due to customer function error: An error occurred (ValidationException) when calling the UpdateItem operation: The provided key element does not match the schema.
A "The provided key element does not match the schema" error means something is wrong with Key (= primary key). Your schema's primary key is guild_id: string. Non-key attributes belong in the AttributeUpdate parameter. See the docs.
Your itemdata appears to include non-key attributes. Also ensure guild_id is a string "123" and not a number type 123.
goodKey={"guild_id": "123"}
table.update_item(Key=goodKey, UpdateExpression="SET ...")
The docs have a full update_item example.
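Since you want the update to stay dynamic, one option (a minimal sketch, assuming the same guildtable table and guild_id key as above; the helper name is mine) is to build the UpdateExpression from whatever non-key fields arrive:
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("guildtable")

def upsert_guild(item_data):
    # Key attributes go in Key only; everything else goes in the expression
    key = {"guild_id": item_data.pop("guild_id")}
    if not item_data:  # nothing besides the key to write
        table.put_item(Item=key)
        return
    # Placeholders (#k0 = :v0, ...) sidestep reserved-word clashes
    expr = "SET " + ", ".join(f"#k{i} = :v{i}" for i in range(len(item_data)))
    names = {f"#k{i}": k for i, k in enumerate(item_data)}
    values = {f":v{i}": v for i, v in enumerate(item_data.values())}
    table.update_item(
        Key=key,
        UpdateExpression=expr,
        ExpressionAttributeNames=names,
        ExpressionAttributeValues=values,
    )
Note that update_item creates the item if it does not already exist, so the separate get_item/put_item branch in the question becomes optional.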

AWS Kendra PreHook Lambdas for Data Enrichment

I am working on a POC using Kendra and Salesforce. The connector allows me to connect to my Salesforce Org and index knowledge articles. I have been able to set this up and it is currently working as expected.
There are a few custom fields and data points I want to bring over to help enrich the data even more. One of these is an additional answer / body that will contain key information for the searching.
This field in my data source is rich text containing HTML and is often larger than 2048 characters, a limit that seems to be imposed in a String data field within Kendra.
I came across two hooks that are built in for Pre and Post data enrichment. My thought here is that I can use the pre hook to strip HTML tags and truncate the field before it gets stored in the index.
Hook Reference: https://docs.aws.amazon.com/kendra/latest/dg/API_CustomDocumentEnrichmentConfiguration.html
Current Setup:
I have added a new field to the index called sf_answer_preview. I then mapped this field in the data source to the rich text field in the Salesforce org.
If I run this as is, it will index about 200 of the 1,000 articles and error on the rest because they exceed the 2048-character limit on that field, which is why I am trying to set up the enrichment.
I set up the above enrichment on my data source. I specified a lambda to use in the pre-extraction, as well as no additional filtering, so run this on every article. I am not 100% certain what the S3 bucket is for since I am using a data source, but it appears to be needed so I have added that as well.
For my lambda, I create the following:
exports.handler = async (event) => {
    // Debug
    console.log(JSON.stringify(event))
    // Vars
    const s3Bucket = event.s3Bucket;
    const s3ObjectKey = event.s3ObjectKey;
    const meta = event.metadata;
    // Answer
    const answer = meta.attributes.find(o => o.name === 'sf_answer_preview');
    // Remove HTML Tags
    const removeTags = (str) => {
        if ((str === null) || (str === ''))
            return false;
        else
            str = str.toString();
        return str.replace(/(<([^>]+)>)/ig, '');
    }
    // Truncate
    const truncate = (input) => input.length > 2000 ? `${input.substring(0, 2000)}...` : input;
    let result = truncate(removeTags(answer.value.stringValue));
    // Response
    const response = {
        "version": "v0",
        "s3ObjectKey": s3ObjectKey,
        "metadataUpdates": [
            {"name": "sf_answer_preview", "value": {"stringValue": result}}
        ]
    }
    // Debug
    console.log(response)
    // Response
    return response
};
Based on the contract for the lambda described in the hook reference above, it appears pretty straightforward. I access the event, find the attribute called sf_answer_preview (the rich text field from Salesforce), and strip and truncate its value to 2,000 characters.
For the response, I am telling it to update that field to the new formatted answer so that it complies with the field limits.
When I log the data in the lambda, the pre-extraction event details are as follows:
{
    "s3Bucket": "kendrasfdev",
    "s3ObjectKey": "pre-extraction/********/22736e62-c65e-4334-af60-8c925ef62034/https://*********.my.salesforce.com/ka1d0000000wkgVAAQ",
    "metadata": {
        "attributes": [
            {
                "name": "_document_title",
                "value": {
                    "stringValue": "What majors are under the Exploratory track of Health and Life Sciences?"
                }
            },
            {
                "name": "sf_answer_preview",
                "value": {
                    "stringValue": "A complete list of majors affiliated with the Exploratory Health and Life Sciences track is available online. This track allows you to explore a variety of majors related to the health and life science professions. For more information, please visit the Exploratory program description. "
                }
            },
            {
                "name": "_data_source_sync_job_execution_id",
                "value": {
                    "stringValue": "0fbfb959-7206-4151-a2b7-fce761a46241"
                }
            }
        ]
    }
}
The Problem:
When this runs, I am still getting the same field limit error that the content exceeds the character limit. When I run the lambda on the raw data, it strips and truncates it as expected. I am thinking that the response in the lambda for some reason isn't setting the field value to the new content correctly and still trying to use the data directly from Salesforce, thus throwing the error.
Has anyone set up lambdas for Kendra before that might know what I am doing wrong? This seems pretty common to be able to do things like strip PII information before it gets indexed, so I must be slightly off on my setup somewhere.
Any thoughts?
Since you are still passing the rich text as a metadata field of the document, the character limit still applies: the document fails at the validation step of the API call and never reaches the enrichment step. A workaround is to somehow append those rich text fields to the body of the document so that your lambda can access them there. But if those fields are auto-generated for your documents by your data source, that might not be easy.
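As a rough illustration of that workaround (written in Python for consistency with the rest of this page; it assumes the same event and response shapes shown in the question, and the S3 read/modify/write step is my own guess at how to reach the document body), a pre-extraction lambda could fold the stripped field into the body instead of the metadata:
import re
import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    bucket, key = event["s3Bucket"], event["s3ObjectKey"]
    # Fetch the extracted document that Kendra staged in S3
    doc = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    attrs = event["metadata"]["attributes"]
    answer = next((a for a in attrs if a["name"] == "sf_answer_preview"), None)
    if answer:
        # Strip tags and truncate, then append to the body so it stays searchable
        text = re.sub(r"<[^>]+>", "", answer["value"]["stringValue"])[:2000]
        s3.put_object(Bucket=bucket, Key=key, Body=(doc + "\n" + text).encode("utf-8"))
    # Point Kendra at the (possibly modified) document; no metadata update needed
    return {"version": "v0", "s3ObjectKey": key}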

How do I avoid putting quotes around each url parameter when using AWS API Gateway?

I am trying to utilize AWS' API Gateway to trigger a lambda function that copies a file from a source bucket to a destination bucket. I would like the form of the API call to be
https://some/api/url/my_lambda_function?key1=joe.mp4&key2=video-files&key3=edited-video-files
I set up the Lambda function, attach an API Gateway, and configure it. The problem is when I set up the integration mapping.
When I run https://some/api/url/my_lambda_function?key1="joe.mp4"&key2="video-files"&key3="edited-video-files" everything works as it should. However if I run it without the quotes around the parameters, I get an error. For example, if I remove quotes around the key3 parameter, the error is
{"message": "Could not parse request body into json: Unrecognized token \'edited\': was expecting (\'true\', \'false\' or \'null\')\n at [Source: (byte[])\"{\n \"key1\": \"joe.mp4\",\n \"key2\": \"video-files\",\n \"key3\": edited-video-files\n\n}\n\"; line: 4, column: 22]"}
Here's my setup.
Under the API Gateway->Resources->Integration Request->Mapping Templates I click the option (When there are no templates defined). I use application/json and my template is:
{
    "key1": $input.params('key1'),
    "key2": $input.params('key2'),
    "key3": $input.params('key3')
}
For completeness, my Lambda is:
import boto3

def lambda_handler(event, context):
    # initialize s3
    s3 = boto3.client("s3")
    # print event output
    print(event)
    FILENAME = event['key1']
    SOURCE_BUCKET = event['key2']
    DEST_BUCKET = event['key3']
    # formatted copy string
    copy_source = {
        'Bucket': SOURCE_BUCKET,
        'Key': FILENAME,
    }
    # copy files
    s3.copy_object(Bucket=DEST_BUCKET, Key=FILENAME, CopySource=copy_source)
    return 'Thanks for watching'
It seems to work if I put quotes around each value in the mapping template's key-value pairs:
{
    "key1": "$input.params('key1')",
    "key2": "$input.params('key2')",
    "key3": "$input.params('key3')"
}
If you want to pass the URL parameters as key/value pairs, e.g. key1="joe.mp4", then you must use quotes, as that is what defines the key's string value.
However, you can also set up mappings for the URL that don't require quotes and are instead separated by a slash ("/"), as highlighted in this example, but these aren't as flexible as the key/value setup because they must appear in a specific order.
For example, with a key/value setup you can do either http://url?key1="value1"&key2="value2"&key3="value3" or http://url?key3="value3"&key1="value1"&key2="value2" and get the same result (note the order of the keys). With static parameters separated by slashes you can't do this; all values must be passed in a fixed order: http://url/value1/value2/value3
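The error message above also shows what goes wrong mechanically: the template substitutes parameter values verbatim, so an unquoted value produces a body that is not valid JSON. A quick illustration in plain Python, independent of API Gateway:
import json

json.loads('{"key3": "edited-video-files"}')  # parses fine
try:
    json.loads('{"key3": edited-video-files}')  # what the unquoted template produces
except json.JSONDecodeError as e:
    print(e)  # "Expecting value ..." - the same class of failure as the API error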

Bounding box in AWS Rekognition

I'm trying to get the bounding boxes from an image in Rekognition. I get the labels, but then I get a KeyError: 'Instances' when reading each label's instances:
import boto3

# Assumes default credentials; the original snippet referenced an existing session
session = boto3.Session()

def detect_labels(bucket, key, max_labels=10, min_confidence=90, region="eu-west-1"):
    rekognition = session.client("rekognition", region)
    response = rekognition.detect_labels(
        Image={
            "S3Object": {
                "Bucket": bucket,
                "Name": key,
            }
        },
        MaxLabels=10
    )
    return response

if __name__ == "__main__":
    response = detect_labels(BUCKET, KEY)
    print('Detected labels for ' + photo)
    print()
    for label in response['Labels']:
        for instance in label['Instances']:
            print(" Bounding box")
            print(" Top: " + str(instance['BoundingBox']['Top']))
            print("----------")
            print()
Please be sure that you are using an up to date boto3 SDK. I have found that boto3 v1.9.20 does not return the instances array, while the current v1.9.84 does return it.
That aside, the documentation states:
If Label represents an object, Instances contains the bounding boxes for each instance ...
That seems to imply that instances will only be present if the label represents an object. Your code should check that a given label actually has instances, for example:
if 'Instances' in label:
    for instance in label['Instances']:
        # print details of each instance here
It would also be simple to confirm this by simply printing the label dict as a JSON string and seeing what it actually contains.
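For that last suggestion, something like this (using the response dict from the code above) shows exactly which keys each label carries:
import json

for label in response['Labels']:
    print(json.dumps(label, indent=2))  # reveals whether 'Instances' is present at all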

Pubnub functions not working on AWS Lambda

I'm trying to use the history method provided by PubNub to get the chat history of a channel, running my Node.js code on AWS Lambda. However, my function is not getting called. I'm not sure if I'm doing it correctly, but here's the code snippet:
var publishKey = "pub-c-cfe10ea4-redacted";
var subscribeKey = "sub-c-fedec8ba-redacted";
var channelId = "ChatRoomDemo";
var uuid;
var pubnub = {};

function readMessages(intent, session, callback) {
    pubnub = require("pubnub")({
        publish_key: publishKey,
        subscribe_key: subscribeKey
    });
    pubnub.history({
        channel: 'ChatRoomDemo',
        callback: function (m) {
            console.log(JSON.stringify(m));
        },
        count: 100,
        reverse: false
    });
}
I expect the message history in JSON format to be displayed on the console.
I had the same problem and finally got it working. What you need to do is allow the CIDR address range for pubnub.com. This was a foreign idea to me until I figured it out! Here's how to do that to publish to a channel:
Copy the CIDR address for pubnub.com, which is 54.246.196.128/26 (Source) [WARNING: do not do this - see comment below]
Log into https://console.aws.amazon.com
Under "Services" go to "VPC"
On the left, under "Security," click "Network ACLs"
Click "Create Network ACL" give it a name tag like "pubnub.com"
Select the VPC for your Lambda skill (if you're not sure, click around your Lambda function, you'll see it. You probably only have one listed like me)
Click "Yes, Create"
Under the "Outbound Rules" tab, click "Edit"
For "Rule #" I just used "1"
For "Type" I used "HTTP (80)"
For "Destination" I pasted in the CIDR from step 1
"Save"
Note, if you're subscribing to a channel, you'll also need to add an "Inbound Rule" too.
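If you'd rather script the same outbound rule than click through the console, here is a sketch with boto3 (the ACL id is a placeholder for the one created above, and note the warning about the hard-coded CIDR):
import boto3

ec2 = boto3.client("ec2")

# Placeholder ACL id - use the Network ACL created for your Lambda's VPC
ec2.create_network_acl_entry(
    NetworkAclId="acl-0123456789abcdef0",
    RuleNumber=1,
    Protocol="6",  # TCP
    RuleAction="allow",
    Egress=True,   # outbound; subscribers also need a matching inbound rule
    CidrBlock="54.246.196.128/26",  # see the warning above - this range can change
    PortRange={"From": 80, "To": 80},
)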