How to send an event to all running instances of an AWS Lambda function

I have a Lambda function whose configuration lives in an external source (say, S3).
To save computing time, the function loads the configuration on first execution and keeps it in memory.
I want to be able to tell all running instances of this Lambda function to reload the configuration on demand.
In the snippet below, the Lambda function reloads the configuration when the event has a 'reload' attribute. But obviously only one of the running instances gets the event.
How can I send the event to all running instances?
Is there a way to broadcast an event?
Is there another way to tackle the problem?
// Function to process events
var processEvent = function(event, callback) {
    // process event
};

// Function to update configuration
var config; // global object representing configuration
var updateConfiguration = function(callback) {
    // ... load asynchronously from S3
};

// Handle events
exports.handler = function(event, context, callback) {
    if (event.reload) {
        updateConfiguration(callback); // load configuration
    } else if (!config) {
        updateConfiguration(function() { // load configuration
            processEvent(event, callback); // ... then process event
        });
    } else {
        processEvent(event, callback); // process event directly
    }
};

There is no capability to send messages to AWS Lambda containers. When they are not running, they are effectively "frozen" with no compute happening.
The closest option would be to have the handler check somewhere (e.g. SSM Parameter Store) on every invocation to see whether it should reload.
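As a minimal sketch of that approach (the parameter name /myapp/config-version and the loadConfigFromS3 helper are hypothetical; you would bump the parameter whenever the configuration in S3 changes):

var AWS = require('aws-sdk');
var ssm = new AWS.SSM();

var config;        // cached configuration (survives across warm invocations)
var configVersion; // version string of the cached configuration

exports.handler = async function(event) {
    // Check the (hypothetical) version parameter on every invocation
    var param = await ssm.getParameter({ Name: '/myapp/config-version' }).promise();
    var latestVersion = param.Parameter.Value;
    if (!config || configVersion !== latestVersion) {
        config = await loadConfigFromS3(); // stand-in for your existing S3 loader
        configVersion = latestVersion;
    }
    // ... process the event using config
};

One getParameter call per invocation is usually cheap relative to reloading from S3, and every container converges on the new configuration the next time it is invoked.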

Maybe you could subscribe the Lambda to an SNS topic and send the reload event through that topic. The handler can then update the config variable when it receives the SNS message.
Keep in mind that lambda functions have some special caching behaviour regarding global variables. You can read more on this topic here: https://medium.com/tensult/aws-lambda-function-issues-with-global-variables-eb5785d4b876
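A rough sketch of how the handler from the question might branch on an SNS record (the 'reload' message body is illustrative; config, updateConfiguration, and processEvent are the same globals as in the question):

exports.handler = function(event, context, callback) {
    // SNS-triggered invocations carry the message under event.Records[0].Sns
    var isSnsReload = event.Records && event.Records[0].Sns &&
        event.Records[0].Sns.Message === 'reload';

    if (isSnsReload) {
        updateConfiguration(callback); // refresh the cached config
    } else if (!config) {
        updateConfiguration(function() {
            processEvent(event, callback);
        });
    } else {
        processEvent(event, callback);
    }
};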

Related

Lambda sending messages to SQS via Event Invoke, how to send an array in chunks?

I have a terraform setup for AWS that goes something like this:
Lambda #1 -> SQS -> Lambda #2
I can use the terraform resource aws_lambda_function_event_invoke_config to handle the first part:
resource "aws_lambda_function_event_invoke_config" "lambda_1_to_sns" {
function_name = aws_lambda_function.lambda_1.function_name
destination_config {
on_success {
destination = aws_sns_topic.name_of_sqs.arn
}
}
}
Lambda #1 will have an array of messages that it wants to send. I could use the AWS SDK to send each item in the array individually via business logic, but it would be easier to return the entire array as-is and have the infrastructure handle splitting it up. Is this possible?
Say I have an array [1,2,3], I want to return this at the end of my function, have it sent to SQS and somehow configure aws_lambda_function_event_invoke_config (or use the infrastructure any other way) to split it up into 3 separate messages.
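For reference, the SDK fallback mentioned above is only a few lines; a hedged sketch in Node.js (the queue URL is a placeholder, and sendMessageBatch accepts at most 10 entries per call):

const AWS = require('aws-sdk');
const sqs = new AWS.SQS({ apiVersion: '2012-11-05' });

// Split an array into individual SQS messages (max 10 entries per batch call)
async function sendArrayAsMessages(items) {
    const entries = items.map((item, i) => ({
        Id: String(i),                    // must be unique within the batch
        MessageBody: JSON.stringify(item)
    }));
    return sqs.sendMessageBatch({
        QueueUrl: '<YOUR_QUEUE_URL>',
        Entries: entries
    }).promise();
}

// e.g. sendArrayAsMessages([1, 2, 3]) produces three separate messages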

SQS Lambda Trigger with Visibility Timeout extension

I'm working on a solution where I have an SQS queue with a Lambda trigger. My understanding is that Lambda receives messages in batches to be processed, and once the Lambda function succeeds, the messages in the SQS queue are automatically deleted. However, how do I allow only some of those messages to be deleted?
Let's assume this use case:
The Lambda function receives a batch of 10 messages; only 7 are valid and can be processed, and the other 3 need to be reprocessed at a later point.
My initial thought was that I could update the visibility timeout via boto3.sqs.change_visibility_timeout for each of the 3 messages so they would be reprocessed after the timeout. However, since the overall Lambda execution is successful, all 10 messages are deleted from the SQS queue.
Any suggestions?
Yes, by default, the Lambda function deletes all the messages upon success. You would need to handle this in your code, but not by changing the visibility timeout of the messages.
Add a DLQ (dead-letter queue) that will actually handle the failed messages (messages go to the DLQ after a certain number of failed processing attempts, depending on how you set it up).
You have a few options here:
You can handle each item yourself and delete the messages that are processed successfully. If a message isn't successful, you can throw an error and it won't be deleted automatically by the lambda function (see the sketch after this list)
If you use JavaScript you can try with Middy
If you use Python, you can use Lambda Powertools Python
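A minimal sketch of the first option (the queue URL is a placeholder and processMessage stands in for your own logic):

const AWS = require('aws-sdk');
const sqs = new AWS.SQS({ apiVersion: '2012-11-05' });

exports.handler = async (event) => {
    let failures = 0;
    for (const record of event.Records) {
        try {
            await processMessage(record); // your business logic (hypothetical)
            // Delete successful messages yourself so a later batch failure
            // doesn't make them visible again
            await sqs.deleteMessage({
                QueueUrl: '<YOUR_QUEUE_URL>',
                ReceiptHandle: record.receiptHandle
            }).promise();
        } catch (err) {
            failures++; // leave this message in the queue
        }
    }
    // Failing the invocation returns only the remaining (undeleted) messages to the queue
    if (failures > 0) {
        throw new Error(failures + ' message(s) failed');
    }
};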
For AWS Lambdas with an SQS trigger, by default, when your function encounters an error processing one or more messages in a given batch, the entire batch is marked as a failure. All of the messages in the batch are made visible again in the queue. Depending on your redrive policy, you can end up repeatedly processing successful messages along with the failures.
Rather than change the visibility timeout, the simplest way to specify which messages should be retried later and which can safely be deleted from the queue is to change the function response type to ReportBatchItemFailures. This allows you to return a list of failed message ids, indicating that only those messages in the batch should be made visible again in the queue.
Here's what the reporting syntax looks like for a handler function in Node.js:
exports.handler = async (event) => {
    // Process the event, collecting the ids of any messages that failed
    const batchItemFailureResponse = {
        batchItemFailures: [
            { itemIdentifier: "idFailedMessage1" },
            { itemIdentifier: "idFailedMessage2" },
            { itemIdentifier: "idFailedMessage3" }
        ]
    };
    return batchItemFailureResponse;
};
There is more information to be found in the official documentation.
This response type is configured when setting a queue as an event source for the Lambda. If you're configuring from the console, navigate to the Lambda function page, select the Configuration tab, and then choose Triggers. Then choose Add Trigger and choose the SQS trigger type. In addition to providing the standard parameters, be sure to check the box under Report batch item failures after expanding Additional Settings. It should look something like this:
[Screenshot: adding the SQS trigger with "Report batch item failures" checked]
This parameter must be set when first creating the trigger.
This response type can also be defined if you use CloudFormation templates to provision your resources. See the AWS documentation for more information. Note that if you use AWS SAM event source mappings, the documentation suggests that adding FunctionResponseTypes to the YAML with ReportBatchItemFailures in the type list isn't supported. That is incorrect; the documentation is simply outdated. There is an open issue around addressing this oversight.
Finally, in addition to reporting batch item failures, you should provision a target DLQ (dead-letter queue) and determine a reasonable maximum receive count so that action can be taken on messages that fail repeatedly.
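If you are wiring that up with the SDK rather than the console, a hedged sketch (the queue URL and DLQ ARN are placeholders; a maxReceiveCount of 5 is just an example):

const AWS = require('aws-sdk');
const sqs = new AWS.SQS({ apiVersion: '2012-11-05' });

// Attach a redrive policy so messages that fail repeatedly move to the DLQ
async function attachDeadLetterQueue() {
    await sqs.setQueueAttributes({
        QueueUrl: '<SOURCE_QUEUE_URL>',
        Attributes: {
            RedrivePolicy: JSON.stringify({
                deadLetterTargetArn: '<DLQ_ARN>',
                maxReceiveCount: 5 // receives before a message is moved to the DLQ
            })
        }
    }).promise();
}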

Can a lambda that uses vendia serverless-express be a step in a state machine?

I have a lambda setup that uses vendia serverless-express as a handler. It has been working well to serve REST APIs from the single lambda function with multiple routes.
I now have a new requirement, where the same lambda needs to be part of a step function's state machine. But that doesn't seem to be allowed by the vendia app, as it always throws the error "Unable to determine event source based on event", since it expects the event to come from API Gateway or an ALB only.
So, based on this, it looks like I will need a separate lambda for the step function, which leaves me with duplicate code in multiple lambdas.
Is it possible for the lambda to handle the inputs from the step function and still be a vendia express app? Please let me know if I'm trying something that doesn't make sense at all.
If I were you, I would just implement my own converter that transforms a Step Functions event into an API Gateway event, and then call serverless-express in your lambda.
The @types/aws-lambda package contains TypeScript definitions for many AWS events, including these, so try generating a mock API Gateway event from your own step function event value. From the sources (4.3.9) we have the function:
function getEventSourceNameBasedOnEvent({ event }) {
    if (event.requestContext && event.requestContext.elb) return 'AWS_ALB'
    if (event.Records) return 'AWS_LAMBDA_EDGE'
    if (event.requestContext) {
        return event.version === '2.0' ? 'AWS_API_GATEWAY_V2' : 'AWS_API_GATEWAY_V1'
    }
    throw new Error('Unable to determine event source based on event.')
}
So, to make it work correctly, you probably just have to include a mock requestContext value in your mock event, and that should be enough.
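For instance, a hedged sketch of such a converter (the route, headers, and handler wiring are illustrative, and your routes may need more requestContext fields than shown):

const serverlessExpress = require('@vendia/serverless-express');
const app = require('./app'); // your existing express app (hypothetical path)

const expressHandler = serverlessExpress({ app });

// Wrap a Step Functions input in a minimal API-Gateway-v1-shaped event so
// getEventSourceNameBasedOnEvent resolves it to AWS_API_GATEWAY_V1
function toApiGatewayEvent(stepFunctionInput) {
    return {
        httpMethod: 'POST',
        path: '/step-function',   // a route your express app handles
        headers: { 'content-type': 'application/json' },
        multiValueHeaders: { 'content-type': ['application/json'] },
        queryStringParameters: null,
        body: JSON.stringify(stepFunctionInput),
        isBase64Encoded: false,
        requestContext: {}        // its presence is what the detection checks
    };
}

exports.handler = (event, context) => {
    // Real API Gateway / ALB events already carry a requestContext
    const wrapped = event.requestContext ? event : toApiGatewayEvent(event);
    return expressHandler(wrapped, context);
};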

AWS Lambda Function invoked by SQS trigger is not respecting the visibility timeout I'm manually setting within the function

I'm implementing my own webhooks service which will send out events to subscribed webhooks.
Overview of architecture:
events are pushed onto an SQS queue
a lambda function is triggered by SQS messages (event source mapping)
for each event, I make outgoing http requests to subscribed webhooks
non-2xx responses must be retried with exponential backoff (in that event, I change the message visibility on the received message)
since lambdas that are invoked by SQS will automatically delete the message upon completion, I throw an error at the end of the function to prevent the automatic delete
As far as I can tell, the call to change the message visibility is succeeding. I'm wondering if there's something else baked into lambdas that are invoked by SQS. Upon failure from the lambda, is it internally changing the message visibility again? Or do lambdas that are invoked by SQS not respect message visibility changes (this really doesn't make any sense to me)? Curious if anyone has any insight into this problem. I was quite surprised to find out that lambda automatically deletes messages upon success, since it makes my particular use case feel a little clunky: throwing an error to fail the lambda function just to prevent the message from being deleted.
Thanks in advance!
The nature of the SQS integration with Lambda is that the integration controls the polling of the messages. The mechanism used to determine if the message should be deleted is that the response from the lambda is not an error. It's not clearly stated in the documentation, but I believe when an error occurs the integration is going to set the visibility timeout to zero in order to make it immediately available for another process to pick it up. So in your example you are setting it to some number that allows you to retry, but when you return an error the integration is setting the timeout back to zero. If you need to have more control over that process you probably should not use the integration.
Update: This is actually not the case. I was failing to properly wait for the call to adjust the timeout to finish. As such the lambda was closing before that request was completing. The timeout I'm setting on the message within the lambda is being respected. I'm then throwing an error to prevent the message from being deleted.
Since SQS triggers for lambdas process messages in batches, using await in a standard forEach doesn't work (forEach does not await the callbacks if they are async). To bypass that, you can create your own version of an async forEach:
var AWS = require('aws-sdk');

exports.handler = async function(event, context) {
    var sqs = new AWS.SQS({apiVersion: '2012-11-05'});
    await asyncForEach(event.Records, async record => {
        const { receiptHandle } = record;
        const sqsParams = {
            QueueUrl: '<YOUR_QUEUE_URL>',     /* required */
            ReceiptHandle: receiptHandle,     /* required */
            VisibilityTimeout: '<in seconds>' /* required; must be an integer */
        };
        try {
            let res = await sqs.changeMessageVisibility(sqsParams).promise();
            console.log(res);
        } catch (err) {
            console.log(err, err.stack);
            throw new Error('Fail.');
        }
    });
    return {};
};

async function asyncForEach(array, callback) {
    for (let index = 0; index < array.length; index++) {
        await callback(array[index], index, array);
    }
}

Api Gateway: AWS Subdomain for Lambda Integration

I'm attempting to integrate my lambda function, which must run async because it takes too long, with API Gateway. I believe I must, instead of choosing the "Lambda" integration type, choose "AWS Service" and specify Lambda. (e.g. this and this seem to imply that.)
However, I get the message "AWS ARN for integration must contain path or action" when I attempt to set the AWS Subdomain to the ARN of my Lambda function. If I set the subdomain to just the name of my Lambda function, when attempting to deploy I get "AWS ARN for integration contains invalid path".
What is the proper AWS Subdomain for this type of integration?
Note that I could also take the advice of this post and set up a Kinesis stream, but that seems excessive for my simple use case. If that's the proper way to resolve my problem, happy to try that.
So it's pretty annoying to set up, but here are two ways:
Set up a regular Lambda integration and then add the InvocationType header described here http://docs.aws.amazon.com/lambda/latest/dg/API_Invoke.html. The value should be 'Event'.
This is annoying because the console won't let you add headers when you have a Lambda function as the Integration type. You'll have to use the SDK or the CLI, or use Swagger where you can add the header easily.
Set the whole thing up as an AWS integration in the console (this is what you're doing in the question), just so you can set the InvocationType header in the console
Leave subdomain blank
"Use path override" and set it to /2015-03-31/functions/<FunctionARN>/invocations where <FunctionARN> is the full ARN of your lambda function
HTTP method is POST
Add a static header X-Amz-Invocation-Type with value 'Event'
The other option, which I took, was to stick with the Lambda integration type and use two lambdas. The first (code below) runs in under a second and returns immediately; what it really does is fire off a second lambda (your primary one) as an Event, which can be long running (up to the 15-minute limit). I found this more straightforward.
const AWS = require('aws-sdk');
const Lambda = new AWS.Lambda(); // client used below to invoke the second function

/**
 * Note: Step Functions, which are called out in many answers online, do NOT actually work in this case. The reason
 * being that if you use Sequential or even Parallel steps they both require everything to complete before a response
 * is sent. That means that this one will execute quickly but Step Functions will still wait on the other one to
 * complete, thus defeating the purpose.
 *
 * @param {Object} event The Event from Lambda
 */
exports.handler = async (event) => {
    let params = {
        FunctionName: "<YOUR FUNCTION NAME OR ARN>",
        InvocationType: "Event", // <--- This is KEY as it tells Lambda to start execution but immediately return / not wait.
        Payload: JSON.stringify(event)
    };
    // We have to wait for it to at least be submitted. Otherwise Lambda runs too fast and will return before
    // the invocation can be submitted to the backend queue for execution.
    await new Promise((resolve, reject) => {
        Lambda.invoke(params, function(err, data) {
            if (err) {
                reject(err);
            } else {
                resolve('Lambda invoked: ' + data);
            }
        });
    });
    // Always return 200 no matter what
    return {
        statusCode: 200,
        body: "Event Handled"
    };
};