How to identify the first execution of a Lambda version at runtime?

I want to run some code only on the first execution of a Lambda version. (NB: I'm not referring to a cold start scenario as this will occur more than once).
computeInstanceInvocationCount is unfortunately reset to 0 upon every cold start.
functionVersion is an available property, but unless I store this somewhere outside the Lambda I cannot determine whether it is indeed the first execution.
Is it possible to deduce this based on runtime values in event or context? Or is there any other way?

There is no way to know whether this is the first time a Lambda version has ever run from the information passed into the function.
You would have to check elsewhere by setting a flag or parameter in some external store. Remember, though, that multiple copies of the Lambda can be invoked at the same time, so any data store used for this would need to be transactional to ensure the initialisation occurs only once.
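For example, a conditional write to DynamoDB can guarantee that exactly one concurrent invocation wins the "first run" race. A minimal sketch; the table name lambda-init-flags and its key schema are assumptions for illustration, not from the question:

const AWS = require('aws-sdk');
const ddb = new AWS.DynamoDB.DocumentClient();

// Returns true for exactly one invocation per function version.
async function isFirstExecution(functionVersion) {
    try {
        await ddb.put({
            TableName: 'lambda-init-flags',           // assumed table with partition key "version"
            Item: { version: functionVersion },
            ConditionExpression: 'attribute_not_exists(version)' // fails if already claimed
        }).promise();
        return true;  // this invocation claimed the flag first
    } catch (err) {
        if (err.code === 'ConditionalCheckFailedException') {
            return false; // another invocation got there first
        }
        throw err;
    }
}

Inside the handler you would call this with context.functionVersion and run the one-time initialisation only when it returns true.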

One way that you can try is to use AWS Secrets Manager (note: the commands below use Secrets Manager rather than SSM Parameter Store).
On every deployment, update the secret's value to
{"version":"latest","is_firsttime":true}
by running the command below after the deployment:
aws secretsmanager update-secret --secret-id demo --secret-string '{"version":"latest","is_firsttime":true}'
This is something we need to make sure happens as part of every deployment.
Now we can put the logic inside the Lambda; in this demo we will look at is_firsttime only.
var AWS = require('aws-sdk'),
    region = "us-west-2",
    secretName = "demo",
    secret;

var client = new AWS.SecretsManager({
    region: region
});

client.getSecretValue({SecretId: secretName}, function(err, data) {
    if (err) {
        console.log(err, err.stack);
        return;
    }
    secret = JSON.parse(data.SecretString);
    if (secret.is_firsttime === true) {
        console.log("lambda is running first time");
        // any init operation here
        // init completed, now we are good to set `is_firsttime` back to false
        var params = {
            Description: "Init completed, updating value",
            SecretId: "demo",
            SecretString: '{"version":"latest","is_firsttime":false}'
        };
        client.updateSecret(params, function(err, data) {
            if (err) console.log(err, err.stack); // an error occurred
            else console.log(data);               // successful response
        });
    } else {
        console.log("init already completed");
        // rest of the logic for the non-first-time case
    }
});
This is just demo code that will also run in a non-Lambda environment; adjust it accordingly.
Expected response for the first run:
{
ARN: 'arn:aws:secretsmanager:us-west-2:12345:secret:demo-0Nlyli',
Name: 'demo',
VersionId: '3ae6623a-1111-4a41-88e5-12345'
}
Second time
init already completed

Related

GCP Cloud Tasks: shorten period for creating a previously created named task

We are developing a GCP Cloud Tasks based queue process that sends a status email whenever a particular Firestore doc write-trigger fires. We use Cloud Tasks so that a delay can be introduced (using a scheduledTime property 2 minutes in the future) before the email is sent, and to control dedup (by using a task name formatted as [firestore-collection-name]-[doc-id]), since the write trigger on the Firestore doc can fire several times as the document is created and then quickly updated by backend cloud functions.
Once the task's delay period has been reached, the cloud task runs and the email is sent with the updated Firestore document info included, after which the task is deleted from the queue and all is good.
Except:
If the user updates the Firestore doc (say 20 or 30 min later) we want to resend the status email but are unable to create the task using the same task-name. We get the following error:
409 The task cannot be created because a task with this name existed too recently. For more information about task de-duplication see https://cloud.google.com/tasks/docs/reference/rest/v2/projects.locations.queues.tasks/create#body.request_body.FIELDS.task.
This was unexpected, as the queue is empty at this point and the last task completed successfully. The documentation referenced in the error message says:
If the task's queue was created using Cloud Tasks, then another task with the same name can't be created for ~1 hour after the original task was deleted or executed.
Question: is there some way this restriction can be bypassed, either by lowering the amount of time or by removing the restriction altogether?
The short answer is no. As you've already pointed out, the docs are very clear about this behavior: you must wait 1 hour to create a task with the same name as one that was previously created. Neither the API nor the client libraries allow decreasing this time.
Having said that, I would suggest that instead of reusing the same task ID, you use a different one for each task and add an identifier in the body of the request. For example, using Python:
from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2
import datetime

def create_task(project, queue, location, payload=None, in_seconds=None):
    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path(project, location, queue)
    task = {
        'app_engine_http_request': {
            'http_method': 'POST',
            'relative_uri': '/task/' + queue
        }
    }
    if payload is not None:
        converted_payload = payload.encode()
        task['app_engine_http_request']['body'] = converted_payload
    if in_seconds is not None:
        d = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)
        timestamp = timestamp_pb2.Timestamp()
        timestamp.FromDatetime(d)
        task['schedule_time'] = timestamp
    response = client.create_task(parent, task)
    print('Created task {}'.format(response.name))
    print(response)

# You can change DOCUMENT_ID to USER_ID or anything else that identifies the task
create_task(PROJECT_ID, QUEUE, REGION, DOCUMENT_ID)
Facing a similar problem of needing to debounce multiple instances of Firestore write-trigger functions, we worked around the default Cloud Tasks task-name based dedup mechanism (still a constraint as of Nov 2022) by building a small debounce "helper" using Firestore transactions.
We're using a helper collection _syncHelper_ to implement a delayed throttle for side effects of write-trigger fires; in the OP's case, send one email for all writes within 2 minutes.
In our case we are using the Firebase Functions task queue utils rather than interacting with Cloud Tasks directly, but that's immaterial to the solution. The key is to determine the task's execution time in advance and use that as the "dedup key":
// Requires firebase-admin v10+ (modular API)
const { getFirestore, Timestamp } = require('firebase-admin/firestore');
const { getFunctions } = require('firebase-admin/functions');

async function enqueueTask(shopId) {
    const queueName = 'doSomething';
    const now = new Date();
    const next = new Date(now.getTime() + 2 * 60 * 1000);
    try {
        const shouldEnqueue = await getFirestore().runTransaction(async t => {
            const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
            const doc = await t.get(syncRef);
            let data = doc.data();
            if (data?.timestamp.toDate() > now) {
                return false;
            }
            t.set(syncRef, { timestamp: Timestamp.fromDate(next) });
            return true;
        });
        if (shouldEnqueue) {
            let queue = getFunctions().taskQueue(queueName);
            await queue.enqueue(
                { timestamp: next.toISOString() },
                { scheduleTime: next }
            );
        }
    } catch {
        // ...
    }
}
This will ensure a new task is enqueued only if the "next execution" time has passed.
The execution operation (also a cloud function in our case) will remove the sync data entry if it hasn't been changed since it was executed:
// Uses the same firebase-admin imports as above, plus firebase-functions
const functions = require('firebase-functions');

exports.doSomething = functions.tasks.taskQueue({
    retryConfig: {
        maxAttempts: 2,
        minBackoffSeconds: 60,
    },
    rateLimits: {
        maxConcurrentDispatches: 2,
    }
}).onDispatch(async data => {
    let { timestamp } = data;
    await sendYourEmailHere();
    await getFirestore().runTransaction(async t => {
        const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
        const doc = await t.get(syncRef);
        let data = doc.data();
        if (data?.timestamp.toDate() <= new Date(timestamp)) {
            t.delete(syncRef);
        }
    });
});
This isn't a bulletproof solution (if the doSomething() execution function has high latency, for example), but it's good enough for 99% of our use cases.

AWS RDSDataService query not running

I'm trying to use RDSDataService to query an Aurora Serverless database. When I query, my Lambda just times out (I've set the timeout to 5 minutes just to make sure that isn't the problem). I have 1 record in my database, and when I try to query it I get no results, and neither the error nor the data callback is called. I've verified executeSql is called by removing dbClusterOrInstanceArn from my params, which throws the exception for it being missing.
I have also run SHOW FULL PROCESSLIST in the query editor to see if the queries were still running, and they are not. I've given the Lambda both the AmazonRDSFullAccess and AmazonRDSDataFullAccess policies without any luck either. As you can see from the code below, I've already tried what was recommended in issue #2376.
Not that this should matter, but this lambda is triggered by a Kinesis event trigger.
const AWS = require('aws-sdk');

exports.handler = (event, context, callback) => {
    const RDS = new AWS.RDSDataService({apiVersion: '2018-08-01', region: 'us-east-1'});
    for (const record of event.Records) {
        const payload = JSON.parse(Buffer.from(record.kinesis.data, 'base64').toString('utf-8'));
        const data = compileItem(payload);
        const params = {
            awsSecretStoreArn: 'arn:aws:secretsmanager:us-east-1:149070771508:secret:xxxxxxxxx',
            dbClusterOrInstanceArn: 'arn:aws:rds:us-east-1:149070771508:cluster:xxxxxxxxx',
            sqlStatements: `select * from MY_DATABASE.MY_TABLE`
            // database: 'MY_DATABASE'
        };
        console.log('calling executeSql');
        RDS.executeSql(params, (error, data) => {
            if (error) {
                console.log('error', error);
                callback(error, null);
            } else {
                console.log('data', data);
                callback(null, { success: true });
            }
        });
    }
}
EDIT: We've run the command through the AWS CLI and it returns results.
EDIT 2: I'm able to connect to the database using the mysql2 package through the URI, so it's definitely an issue with either the aws-sdk or how I'm using it.
Node.js execution is not waiting for the result, which is why the process exits before the request completes.
Use the serverless-mysql library: https://www.npmjs.com/package/serverless-mysql
OR
set context.callbackWaitsForEmptyEventLoop = false
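For the second option, the flag is set on the context object at the top of the handler; a minimal sketch matching the question's handler shape:

exports.handler = (event, context, callback) => {
    // Return as soon as callback fires, even if the event loop still
    // has open handles (e.g. lingering SDK or database connections).
    context.callbackWaitsForEmptyEventLoop = false;
    // ... rest of the handler as in the question
};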
The problem turned out to be that the RDS cluster had been created in a VPC that the Lambdas were not in.

Aws Pass Value from Lambda Trigger to Step Function

When using a Lambda Function to trigger a Step Function, how do you get the output of the function that triggered it?
Ok, so if you want to pass an input to a Step Function execution (or, more exactly, your state machine's execution), you just need to set that input in the input property when calling StartExecution (see the AWS documentation: Start Execution).
In your case, it would most likely be your Lambda's last step before calling its callback.
If it's a Node.js Lambda, this is what it would look like:
const AWS = require("aws-sdk");
const stepfunctions = new AWS.StepFunctions();

exports.myHandler = function(event, context, callback) {
    // ... your function's code
    const params = {
        stateMachineArn: 'YOUR_STATE_MACHINE_ARN', /* required */
        input: 'STRINGIFIED INPUT',
        name: 'AN EXECUTION NAME (such as a uuid or whatever)'
    };
    stepfunctions.startExecution(params, function(err, data) {
        if (err) callback(err);                       // an error occurred
        else callback(null, "some success message");  // successful response
    });
}
Alternatively, if your payload is too big, you could store the data in S3 or DynamoDB and pass the reference to it as your State Machine's execution's input.
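A rough sketch of that reference-passing pattern, assuming a hypothetical bucket name and a bigPayload variable standing in for your oversized data (Step Functions caps execution input at 256 KB, so anything larger must travel by reference):

const AWS = require("aws-sdk");
const s3 = new AWS.S3();
const stepfunctions = new AWS.StepFunctions();

// Hypothetical bucket and key for illustration only.
const bucket = "my-payload-bucket";
const key = "executions/" + Date.now() + ".json";

// Upload the large payload first...
s3.putObject({ Bucket: bucket, Key: key, Body: JSON.stringify(bigPayload) }, function(err) {
    if (err) return callback(err);
    // ...then start the execution with just the S3 reference as input.
    stepfunctions.startExecution({
        stateMachineArn: 'YOUR_STATE_MACHINE_ARN',
        input: JSON.stringify({ payloadBucket: bucket, payloadKey: key })
    }, function(err, data) {
        if (err) callback(err);
        else callback(null, "execution started");
    });
});

The first state of the state machine then fetches the object from S3 using the bucket and key it received as input.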

Grabbing items from an S3 bucket

So I want to get a list of all the objects in my S3 bucket. I've just put it in an Express route application I quickly set up (it doesn't really matter that it's in Express; it's just what I'm comfortable with).
So I'm doing:
var allObjs = [];
s3.listObjects({Bucket: 'myBucket'}, function(err, data) {
    allObjs = allObjs.concat(data.Contents); // collect the returned objects
    var stringifiedObjs = JSON.stringify(allObjs);
    fs.writeFile("test", stringifiedObjs, function(err) {});
});
Which grabs my objects, stringifies them, and writes them to a file called test. The issue I'm having is that it only returns 1,000 results.
I read somewhere (I can't find where) that AWS limits you to 1,000 results per call.
How can I rerun this and grab the next 1,000, making sure it's the next 1,000 and not the first 1,000 again?
In short, how can I get every object in my S3 bucket? I've been getting lost in the documentation.
Thank you!
EDIT
This is the object I get back:
{ Key: 'bucket_path/e11_19_9a31mv3ot51tm384grjd6rdj51boxx_q_q112.png',
  LastModified: Sat Apr 23 2016 09:16:23 GMT+0100 (BST),
  ETag: '"7d50fsdfsd4sda159b32cf85c683c5924"',
  Size: 704222,
  StorageClass: 'STANDARD',
  Owner:
   { DisplayName: 'servers',
     ID: '58af203151c51eddf2sdfs411e0b91d274a8fda23f58280f9b06371e436f7' } }
You need to set the Marker property to the last key returned by the previous call.
Check the documentation for reference (as you already did :) ):
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
When you receive the response from the listObjects call, it should include two very important fields in the data property:
IsTruncated - true if there are more keys to return, false otherwise.
NextMarker - the value to use for the Marker property in the next call to listObjects. Note that NextMarker is only returned when a Delimiter was specified in the request; otherwise, use the Key of the last object returned.
So after you call listObjects, check the IsTruncated field. If it is true, feed the marker value into the Marker property and call listObjects again.
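For example, a manual version of that loop might look like this (a sketch reusing the question's s3 client and fs module):

var allObjs = [];
function listPage(marker) {
    var params = { Bucket: 'myBucket' };
    if (marker) params.Marker = marker;
    s3.listObjects(params, function(err, data) {
        if (err) return console.log(err);
        allObjs = allObjs.concat(data.Contents);
        if (data.IsTruncated) {
            // NextMarker is only present when a Delimiter was specified,
            // so fall back to the Key of the last object returned.
            listPage(data.NextMarker || data.Contents[data.Contents.length - 1].Key);
        } else {
            fs.writeFile("test", JSON.stringify(allObjs), function(err) {});
        }
    });
}
listPage();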
Update:
It would appear that the AWS.Request object has an .eachPage method which can be used to automatically make multiple calls. So there is a magical function to do this work for you.
var pages = 1;
s3.listObjects({Bucket: 'myBucket'}).eachPage(function(err, data) {
    if (err) return false;     // stop paging on error
    if (data === null) return; // no more pages
    console.log("Page", pages++);
    console.log(data);
});
Source: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Request.html

In AWS SQS, Attribute is visible in console and not able to read it programatically

I am trying to insert a token into SQS using an "AWS Service Proxy" in Web API, with the path override given below:
Account#/QueueName?Action=SendMessage&MessageAttribute.1.Name="TEST"&MessageAttribute.1.Value.StringValue="abcd"&MessageAttribute.1.Value.DataType=String&Version=2012-11-05&Expires=2100-05-05T22%3A52%3A43PST
I can see the attribute "TEST" created with the value "abcd" in the SQS console, but when I try to retrieve it through code, I am not able to get the attribute "TEST". The code I am using to retrieve it is given below:
sqs.receiveMessage({
    QueueUrl: sqsQueueUrl,
    MaxNumberOfMessages: 3, // how many messages do we want to retrieve?
    VisibilityTimeout: 60,  // seconds - how long we want a lock on this job
    WaitTimeSeconds: 3,     // seconds - how long should we wait for a message?
    AttributeNames: ["All"]
}, function (err, data) {
    // If there are any messages to get
    if (data.Messages) {
        // Get the first message
        var message = data.Messages[0],
            body = JSON.parse(message.Body);
        // Now this is where you'd do something with this message
        context.done("Message Found");
    }
});
Please let me know what I am missing. Thanks in advance.
Nowhere in your code are you attempting to read the message attributes; you are only accessing message.Body. Custom message attributes are returned under message.MessageAttributes, and they are only included in the response if you request them with the MessageAttributeNames parameter; AttributeNames: ["All"] only returns system attributes such as SentTimestamp.
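A sketch of the receive call with those two changes applied:

sqs.receiveMessage({
    QueueUrl: sqsQueueUrl,
    MaxNumberOfMessages: 3,
    VisibilityTimeout: 60,
    WaitTimeSeconds: 3,
    MessageAttributeNames: ["All"] // request custom message attributes such as TEST
}, function (err, data) {
    if (data && data.Messages) {
        var message = data.Messages[0];
        // Custom attributes arrive under MessageAttributes, not Attributes.
        console.log(message.MessageAttributes.TEST.StringValue); // "abcd"
    }
});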