Not able to solve throttlingException in DynamoDB - amazon-web-services

I have a lambda function which does a transaction in DynamoDB similar to this.
try {
const reservationId = genId();
await transactionFn();
return {
statusCode: 200,
body: JSON.stringify({id: reservationId})
};
async function transactionFn() {
try {
await docClient.transactWrite({
TransactItems: [
{
Put: {
TableName: ReservationTable,
Item: {
reservationId,
userId,
retryCount: Number(retryCount),
}
}
},
{
Update: {
TableName: EventDetailsTable,
Key: {eventId},
ConditionExpression: 'available >= :minValue',
UpdateExpression: `set available = available - :val, attendees= attendees + :val, lastUpdatedDate = :updatedAt`,
ExpressionAttributeValues: {
":val": 1,
":updatedAt": currentTime,
":minValue": 1
}
}
}
]
}).promise();
return true
} catch (e) {
const transactionConflictError = e.message.search("TransactionConflict") !== -1;
// const throttlingException = e.code === 'ThrottlingException';
console.log("transactionFn:transactionConflictError:", transactionConflictError);
if (transactionConflictError) {
retryCount += 1;
await transactionFn();
return;
}
// if(throttlingException){
//
// }
console.log("transactionFn:e.code:", JSON.stringify(e));
throw e
}
}
It just updating 2 tables on api call. If it encounter a transaction conflict error, it simply retry the transaction by recursively calling the function.
The eventDetails table is getting too much db updates. ( checked it with aws Contributor Insights) so, made provisioned unit to a higher value than earlier.
For reservationTable Provisioned capacity is on Demand.
When I do load test over this api with 400 (or more) users using JMeter (master slave configuration) I am getting Throttled error for some api calls and some api took more than 20 sec to respond.
When I checked X-Ray for this api found that, DynamoDB is taking too much time for this transasction for the slower api calls.
Even with much fixed provisioning ( I tried on demand scaling too ) , I am getting throttled exception for api calls.
ProvisionedThroughputExceededException: The level of configured provisioned throughput for the table was exceeded.
Consider increasing your provisioning level with the UpdateTable API.
UPDATE
And one more thing. When I do the load testing, I am always uses the same eventId. It means, I am always updating the same row for all the api requests. I have found this article, which says that, a single partition can only have upto 1000 WCU. Since I am always updating the same row in the eventDetails table during load testing, is that causing this issue ?

I had this exact error and it helped me to change the On Demand to Provisioned under Read/write capacity mode. Try to change that, if that doesn't help, we'll go from there.

From the link you cite in your update, also described in an AWS help article here, it sounds like the issue is that all of your load testers are writing to the same entry in the table, which is going to be in the same partition, subject to the hard limit of 1,000 WCU.
Have you tried repeating this experiment with the load testers writing to different partitions?

Related

GCP Cloud Tasks: shorten period for creating a previously created named task

We are developing a GCP Cloud Task based queue process that sends a status email whenever a particular Firestore doc write-trigger fires. The reason we use Cloud Tasks is so a delay can be created (using scheduledTime property 2-min in the future) before the email is sent, and to control dedup (by using a task-name formatted as: [firestore-collection-name]-[doc-id]) since the 'write' trigger on the Firestore doc can be fired several times as the document is being created and then quickly updated by backend cloud functions.
Once the task's delay period has been reached, the cloud-task runs, and the email is sent with updated Firestore document info included. After which the task is deleted from the queue and all is good.
Except:
If the user updates the Firestore doc (say 20 or 30 min later) we want to resend the status email but are unable to create the task using the same task-name. We get the following error:
409 The task cannot be created because a task with this name existed too recently. For more information about task de-duplication see https://cloud.google.com/tasks/docs/reference/rest/v2/projects.locations.queues.tasks/create#body.request_body.FIELDS.task.
This was unexpected as the queue is empty at this point as the last task completed succesfully. The documentation referenced in the error message says:
If the task's queue was created using Cloud Tasks, then another task
with the same name can't be created for ~1hour after the original task
was deleted or executed.
Question: is there some way in which this restriction can be by-passed by lowering the amount of time, or even removing the restriction all together?
The short answer is No. As you've already pointed, the docs are very clear regarding this behavior and you should wait 1 hour to create a task with same name as one that was previously created. The API or Client Libraries does not allow to decrease this time.
Having said that, I would suggest that instead of using the same Task ID, use different ones for the task and add an identifier in the body of the request. For example, using Python:
from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2
import datetime
def create_task(project, queue, location, payload=None, in_seconds=None):
client = tasks_v2.CloudTasksClient()
parent = client.queue_path(project, location, queue)
task = {
'app_engine_http_request': {
'http_method': 'POST',
'relative_uri': '/task/'+queue
}
}
if payload is not None:
converted_payload = payload.encode()
task['app_engine_http_request']['body'] = converted_payload
if in_seconds is not None:
d = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)
timestamp = timestamp_pb2.Timestamp()
timestamp.FromDatetime(d)
task['schedule_time'] = timestamp
response = client.create_task(parent, task)
print('Created task {}'.format(response.name))
print(response)
#You can change DOCUMENT_ID with USER_ID or something to identify the task
create_task(PROJECT_ID, QUEUE, REGION, DOCUMENT_ID)
Facing a similar problem of requiring to debounce multiple instances of Firestore write-trigger functions, we worked around the default Cloud Tasks task-name based dedup mechanism (still a constraint in Nov 2022) by building a small debounce "helper" using Firestore transactions.
We're using a helper collection _syncHelper_ to implement a delayed throttle for side effects of write-trigger fires - in the OP's case, send 1 email for all writes within 2 minutes.
In our case we are using Firebease Functions task queue utils and not directly interacting with Cloud Tasks but thats immaterial to the solution. The key is to determine the task's execution time in advance and use that as the "dedup key":
async function enqueueTask(shopId) {
const queueName = 'doSomething';
const now = new Date();
const next = new Date(now.getTime() + 2 * 60 * 1000);
try {
const shouldEnqueue = await getFirestore().runTransaction(async t=>{
const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
const doc = await t.get(syncRef);
let data = doc.data();
if (data?.timestamp.toDate()> now) {
return false;
}
await t.set(syncRef, { timestamp: Timestamp.fromDate(next) });
return true;
});
if (shouldEnqueue) {
let queue = getFunctions().taskQueue(queueName);
await queue.enqueue({
timestamp: next.toISOString(),
},
{ scheduleTime: next }); }
} catch {
...
}
}
This will ensure a new task is enqueued only if the "next execution" time has passed.
The execution operation (also a cloud function in our case) will remove the sync data entry if it hasn't been changed since it was executed:
exports.doSomething = functions.tasks.taskQueue({
retryConfig: {
maxAttempts: 2,
minBackoffSeconds: 60,
},
rateLimits: {
maxConcurrentDispatches: 2,
}
}).onDispatch(async data => {
let { timestamp } = data;
await sendYourEmailHere();
await getFirestore().runTransaction(async t => {
const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
const doc = await t.get(syncRef);
let data = doc.data();
if (data?.timestamp.toDate() <= new Date(timestamp)) {
await t.delete(syncRef);
}
});
});
This isn't a bullet proof solution (if the doSomething() execution function has high latency for example) but good enough for 99% of our use cases.

I am learning to create AWS Lambdas. I want to create a "chain": S3 -> 4 Chained Lambda()'s -> RDS. I can't get the first lambda to call the second

I really tried everything. Surprisingly google has not many answers when it comes to this.
When a certain .csv file is uploaded to a S3 bucket I want to parse it and place the data into a RDS database.
My goal is to learn the lambda serverless technology, this is essentially an exercise. Thus, I over-engineered the hell out of it.
Here is how it goes:
S3 Trigger when the .csv is uploaded -> call lambda (this part fully works)
AAA_Thomas_DailyOverframeS3CsvToAnalytics_DownloadCsv downloads the csv from S3 and finishes with essentially the plaintext of the file. It is then supposed to pass it to the next lambda. The way I am trying to do this is by putting the second lambda as destination. The function works, but the second lambda is never called and I don't know why.
AAA_Thomas_DailyOverframeS3CsvToAnalytics_ParseCsv gets the plaintext as input and returns a javascript object with the parsed data.
AAA_Thomas_DailyOverframeS3CsvToAnalytics_DecryptRDSPass only connects to KMS, gets the encrcypted RDS password, and passes it along with the data it received as input to the last lambda.
AAA_Thomas_DailyOverframeS3CsvToAnalytics_PutDataInRds then finally puts the data in RDS.
I created a custom VPC with custom subnets, route tables, gateways, peering connections, etc. I don't know if this is relevant but function 2. only has access to the s3 endpoint, 3. does not have any internet access whatsoever, 4. is the only one that has normal internet access (it's the only way to connect to KSM), and 5. only has access to the peered VPC which hosts the RDS.
This is the code of the first lambda:
// dependencies
const AWS = require('aws-sdk');
const util = require('util');
const s3 = new AWS.S3();
let region = process.env;
exports.handler = async (event, context, callback) =>
{
var checkDates = process.env.CheckDates == "false" ? false : true;
var ret = [];
var checkFileDate = function(actualFileName)
{
if (!checkDates)
return true;
var d = new Date();
var expectedFileName = 'Overframe_-_Analytics_by_Day_Device_' + d.getUTCFullYear() + '-' + (d.getUTCMonth().toString().length == 1 ? "0" + d.getUTCMonth() : d.getUTCMonth()) + '-' + (d.getUTCDate().toString().length == 1 ? "0" + d.getUTCDate() : d.getUTCDate());
return expectedFileName == actualFileName.substr(0, expectedFileName.length);
};
for (var i = 0; i < event.Records.length; ++i)
{
var record = event.Records[i];
try {
if (record.s3.bucket.name != process.env.S3BucketName)
{
console.error('Unexpected notification, unknown bucket: ' + record.s3.bucket.name);
continue;
}
if (!checkFileDate(record.s3.object.key))
{
console.error('Unexpected file, or date is not today\'s: ' + record.s3.object.key);
continue;
}
const params = {
Bucket: record.s3.bucket.name,
Key: record.s3.object.key
};
var csvFile = await s3.getObject(params).promise();
var allText = csvFile.Body.toString('utf-8');
console.log('Loaded data:', {Bucket: params.Bucket, Filename: params.Key, Text: allText});
ret.push(allText);
} catch (error) {
console.log("Couldn't download CSV from S3", error);
return { statusCode: 500, body: error };
}
}
// I've been randomly trying different ways to return the data, none works. The data itself is correct , I checked with console.log()
const response = {
statusCode: 200,
body: { "Records": ret }
};
return ret;
};
While this shows how the lambda was set up, especially its destination:
I haven't posted on Stackoverflow in 7 years. That's how desperate I am. Thanks for the help.
Rather than getting each Lambda to call the next one take a look at AWS managed service for state machines, step functions which can handle this workflow for you.
By providing input and outputs you can pass output to the next function, with retry logic built into it.
If you haven't much experience AWS has a tutorial on setting up a step function through chaining Lambdas.
By using this you also will not need to account for configuration issues such as Lambda timeouts. In addition it allows your code to be more modular which improves testing the individual functionality, whilst also isolating issues.
The execution roles of all Lambda functions, whose destinations include other Lambda functions, must have the lambda:InvokeFunction IAM permission in one of their attached IAM policies.
Here's a snippet from Lambda documentation:
To send events to a destination, your function needs additional permissions. Add a policy with the required permissions to your function's execution role. Each destination service requires a different permission, as follows:
Amazon SQS – sqs:SendMessage
Amazon SNS – sns:Publish
Lambda – lambda:InvokeFunction
EventBridge – events:PutEvents

How to identify the first execution of a Lambda version at runtime?

I want to run some code only on the first execution of a Lambda version. (NB: I'm not referring to a cold start scenario as this will occur more than once).
computeInstanceInvocationCount is unfortunately reset to 0 upon every cold start.
functionVersion is an available property but unless I store this in memory outside the lambda I cannot calculate if it is indeed the first execution.
Is it possible to deduce this based on runtime values in event or context? Or is there any other way?
There is no way of knowing if this is the first time that a Lambda has ever run from any information passed into the Lambda.
You would have to include functionality to check elsewhere by setting a flag or parameter there, remember though that multiple copies of the Lambda could be invoked at the same time so any data store for this would presumably need to be transactional to ensure that it occurs only once.
One way that you can try is to use AWS parameter store.
On every deployment update the parameter store value with
{"version":"latest","is_firsttime":true}
So run the below command after deployment
aws secretsmanager update-secret --secret-id demo --secret-string '{"version":"latest","is_firsttime":true}'
So this is something that we need to make sure before deployment.
Now we can set logic inside the lambda, in the demo we will look into is_firsttime only.
var AWS = require('aws-sdk'),
region = "us-west-2",
secretName = "demo",
secret,
decodedBinarySecret;
var client = new AWS.SecretsManager({
region: region
});
client.getSecretValue({SecretId: secretName}, function(err, data) {
secret = data.SecretString;
secret=JSON.parse(secret)
if ( secret.is_firsttime == true)
{
console.log("lambda is running first time")
// any init operation here
// init completed, now we are good to set back it `is_firsttime` to false
var params = {
Description: "Init completeed, updating value at date or anythign",
SecretId: "demo",
SecretString : '[{ "version" : "latest", "is_firsttime": false}]'
};
client.updateSecret(params, function(err, data) {
if (err) console.log(err, err.stack); // an error occurred
else console.log(data); // successful response
});
}
else{
console.log("init already completed");
// rest of logic incase of not first time
}
})
This is just a demo code that will work in a non-lambda environment, adjust it accordingly.
Expected response for first time
{
ARN: 'arn:aws:secretsmanager:us-west-2:12345:secret:demo-0Nlyli',
Name: 'demo',
VersionId: '3ae6623a-1111-4a41-88e5-12345'
}
Second time
init already completed

Is there a reason why large JSON files (4000 objects) don't write to dynamodb but small files (10 objects) works

Getting a Task timed out after 3.01 seconds message in cloudwatch when trying to read large JSON(66mb) from S3 bucket and write data to dynamodb.
Smaller JSON files are reading and writing to my dynamodb table but when the JSON file contains a larger amount of objects (4000 objects, 66MB file) in this instance, the lambda function just returns Task timed out after 3.01 seconds.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();
const documentClient = new AWS.DynamoDB.DocumentClient( {
convertEmptyValues: true
} );
exports.handler = async (event) => {
const{name} = event.Records[0].s3.bucket;
const{key} = event.Records[0].s3.object;
const params = {
Bucket: name,
Key: key
}
try {
const data = await s3.getObject(params).promise();
const carsStr = data.Body.toString();
const usersJSON = JSON.parse(carsStr);
console.log(`USERS ::: ${carsStr}`);
for (var i = 0; i < usersJSON.length; i++) {
var record = usersJSON[i];
console.log("Inserting record: " + record);
var putParams = {
Item: record,
ReturnConsumedCapacity: "TOTAL",
TableName: "cars"
};
await documentClient.put(putParams).promise();
}
} catch(err) {
console.log(err);
}
};
AWS Lambda default timeout is 3 seconds and that is the reason you see a timeout error in the logs. You can increase it as per your need and up to a maximum of 900 seconds
As per official documentation
Timeout – The amount of time that Lambda allows a function to run
before stopping it. The default is 3 seconds. The maximum allowed
value is 900 seconds.
Note: Increasing timeout is surely a solution for task that requires longer execution time. But always consider code optimization before increasing the timeout.
Instead of using the Lambda to write to DynamoDB, populate your table on your development machine.
If you are expecting large JSON files coming into your S3 bucket often, to trigger the Lambda, you need to increase the default timeout of your function. The maximum timeout is 15 minutes. The default timeout is 3 seconds.
You can also increase the number of DynamoDB Write Capacity Units on your table. A write capacity unit represents one write per second, for an item up to 1 KB in size. For example, suppose that you create a table with 10 write capacity units. This allows you to perform 10 writes per second, for items up to 1 KB in size per second.
Documentation: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ProvisionedThroughput.html#ProvisionedThroughput.CapacityUnits.Write

Implementation of Atomic Transactions in dynamodb

I have a table in dynamodb, where I need to update multiple related items at once(I can't put all data in one item because of 400kb size limit).
How can I make sure that either multiple rows are updated successfully or none.
End goal is to read consistent data after update.
On November 27th, 2018, transactions for Dynamo DB were announced. From the linked article:
DynamoDB transactions provide developers atomicity, consistency, isolation, and durability (ACID) across one or more tables within a single AWS account and region. You can use transactions when building applications that require coordinated inserts, deletes, or updates to multiple items as part of a single logical business operation. DynamoDB is the only non-relational database that supports transactions across multiple partitions and tables.
The new APIs are:
TransactWriteItems, a batch operation that contains a write set, with one or more PutItem, UpdateItem, and DeleteItem operations. TransactWriteItems can optionally check for prerequisite conditions that must be satisfied before making updates. These conditions may involve the same or different items than those in the write set. If any condition is not met, the transaction is rejected.
TransactGetItems, a batch operation that contains a read set, with one or more GetItem operations. If a TransactGetItems request is issued on an item that is part of an active write transaction, the read transaction is canceled. To get the previously committed value, you can use a standard read.
The linked article also has a JavaScript example:
data = await dynamoDb.transactWriteItems({
TransactItems: [
{
Update: {
TableName: 'items',
Key: { id: { S: itemId } },
ConditionExpression: 'available = :true',
UpdateExpression: 'set available = :false, ' +
'ownedBy = :player',
ExpressionAttributeValues: {
':true': { BOOL: true },
':false': { BOOL: false },
':player': { S: playerId }
}
}
},
{
Update: {
TableName: 'players',
Key: { id: { S: playerId } },
ConditionExpression: 'coins >= :price',
UpdateExpression: 'set coins = coins - :price, ' +
'inventory = list_append(inventory, :items)',
ExpressionAttributeValues: {
':items': { L: [{ S: itemId }] },
':price': { N: itemPrice.toString() }
}
}
}
]
}).promise();
You can use an API like this one for Java, http://aws.amazon.com/blogs/aws/dynamodb-transaction-library/. The transaction library API will help you manage atomic transactions.
If you're using node.js, there are other solutions for that using an atomic counter or conditional writes. See answer here, How to support transactions in dynamoDB with javascript aws-sdk?.