Trigger Lambda function based on attribute value in DynamoDB - amazon-web-services

I have a DynamoDB table whose items have these attributes: id, user, status. Status can take values A or B.
Is it possible to trigger a Lambda based only on the value of the 'status' attribute?
For example, trigger the Lambda when a new item is added to the table with status == A, or when the status of an existing item is updated to A.
(I am looking into DynamoDB Streams for this, but I have not come across an example of anyone using them for this use case.)
Is it possible to monitor a DynamoDB table based on the value of a certain attribute?
For example, when status == B I don't want to trigger the Lambda, but only emit a metric for that item. Basically, I want a metric showing how many items in the table have status == B at a given point.
If not with DynamoDB, are the above two things possible with any other storage type?

Yes, as your initial research has uncovered, this is something you'll want to use DynamoDB Streams for.
You can trigger a Lambda function when an item is written, updated, or removed in DynamoDB, and you can configure your stream subscription to filter on only the attributes and values you care about.
DynamoDB recently introduced the ability to filter stream events before invoking your function; you can read more about how that works and how to configure it here.
For more information about DynamoDB Stream use cases, this post may be helpful.
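As a sketch of that event filtering, the pattern below only invokes the function when an insert or update leaves the item with status == A. The table/function ARNs are placeholders, and `build_status_filter` is just an illustrative helper; the pattern syntax itself is the documented DynamoDB Streams event-filtering format:

```python
import json

def build_status_filter(status_value):
    """FilterCriteria for a stream event source mapping: only deliver
    records whose NewImage has the given string value for 'status'.
    This matches both new items and updates, since both carry a NewImage."""
    return {
        "Filters": [
            {"Pattern": json.dumps({
                "dynamodb": {"NewImage": {"status": {"S": [status_value]}}}
            })}
        ]
    }

filter_criteria = build_status_filter("A")
print(filter_criteria["Filters"][0]["Pattern"])

# Attaching it (hypothetical ARNs) would look like:
# import boto3
# boto3.client("lambda").create_event_source_mapping(
#     EventSourceArn="arn:aws:dynamodb:us-east-1:123456789012:table/MyTable/stream/...",
#     FunctionName="my-status-a-handler",
#     StartingPosition="LATEST",
#     FilterCriteria=filter_criteria,
# )
```

For the status == B metric, the same mechanism works with a second mapping filtered on "B", whose handler only emits a CloudWatch metric instead of doing work.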

Related

Why do I receive two events after a update on dynamodb?

I have configured a DynamoDB stream to trigger my Lambda. When I update an item in the DynamoDB table, I see my Lambda triggered twice with two different events. The NewImage and OldImage are the same in these two events; they differ only in eventID, ApproximateCreationDateTime, SequenceNumber, etc.
And there is only a 1 millisecond difference between their timestamps.
I updated the item via the DynamoDB console, which means only one action should have happened. Otherwise, it would be impossible to update the item twice within 1 millisecond via the console.
Is it expected to see two events?
This would not be expected behaviour.
If you're seeing 2 separate events, that would indicate 2 separate actions occurred. As there's a time difference, this suggests a secondary action took place.
From the AWS documentation, the following is true:
DynamoDB Streams helps ensure the following:
Each stream record appears exactly once in the stream.
For each item that is modified in a DynamoDB table, the stream records appear in the same sequence as the actual modifications to the item.
This will likely be related to your application; ensure that you're not issuing multiple writes where you think there is a single one.
Also check CloudTrail to see whether there are multiple API calls. I would imagine that if you're using global tables, there's a possibility of seeing a secondary API call, as the contents of the item would be modified by the DynamoDB service.
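While investigating, you can also guard the handler against duplicate deliveries by deduplicating on eventID, which is unique per stream record. A minimal sketch (the in-memory set only survives for the lifetime of one Lambda container; a durable dedupe would use a DynamoDB table or cache instead):

```python
# Stream record IDs we've already processed in this container.
_seen_event_ids = set()

def handler(event, context):
    """Process each stream record at most once, keyed on its eventID."""
    processed = []
    for record in event.get("Records", []):
        event_id = record["eventID"]
        if event_id in _seen_event_ids:
            continue  # duplicate delivery of the same stream record
        _seen_event_ids.add(event_id)
        processed.append(event_id)  # real work on the record would go here
    return processed
```

If the two events have *different* eventIDs, as in your case, this won't merge them, which is itself evidence that two distinct modifications reached the stream.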

Finding expired data in aws dynamoDB

I have a requirement where I need to store some data in dynamo-db with a status and a timestamp. Eg. <START, 20180203073000>
Now, the above status flips to STOP when I receive a message from SQS. But to make my system error-proof, I need some mechanism to identify when data with START status in DynamoDB is older than 1 day, and then set its status to STOP, so that it does not wait indefinitely for the message to arrive from SQS.
Is there an AWS feature I can use to achieve this, without polling for data at regular intervals?
Not sure if this will fit your needs, but here is one possibility:
1. Enable TTL on your DynamoDB table. This will work if your timestamp attribute is a Number data type containing time in epoch format. Once the timestamp expires, the corresponding item is deleted from the table in the background. (Note that TTL deletion is not immediate; it can lag the expiration time by up to a couple of days.)
2. Enable Streams on your DynamoDB table. Items that are deleted by TTL will be sent to the stream.
3. Create a trigger that connects the DynamoDB stream to a Lambda function. In your case the trigger will receive your entire deleted item.
4. Modify your record (set 'START' to 'STOP'), remove your timestamp attribute (items with no TTL attribute are not deleted) and re-insert it into the table.
This way you avoid table scans searching for expired items, but on the other hand there may be a cost associated with the Lambda executions.
You can try creating a GSI using the status as primary key and timestamp as sort key. When querying for expired items, use a condition expression like status = "START" and timestamp < 1-day-ago.
Be careful though, because this basically creates 2 hot partitions (START and STOP), so make sure the index projects only the data you need and no more.
If you have an attribute that's set when status = START but doesn't exist otherwise, you can take advantage of a sparse index (DynamoDB won't index an item in a GSI at all if the GSI key attributes don't exist on the item, so you don't need to filter those items out at query time).
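A query against such a GSI might look like the sketch below. The index name is hypothetical, and both key attributes need expression aliases because `status` and `timestamp` are DynamoDB reserved words:

```python
import time

def expired_start_query_params(now=None, one_day=86400):
    """Build query kwargs for a hypothetical 'status-timestamp-index' GSI
    with 'status' as partition key and 'timestamp' (epoch seconds) as sort key."""
    now = now if now is not None else int(time.time())
    return {
        "IndexName": "status-timestamp-index",
        "KeyConditionExpression": "#st = :start AND #ts < :cutoff",
        "ExpressionAttributeNames": {"#st": "status", "#ts": "timestamp"},
        "ExpressionAttributeValues": {":start": "START", ":cutoff": now - one_day},
    }

# With a boto3 Table resource this would be used as:
# response = table.query(**expired_start_query_params())
```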

Find whether the value has been updated or inserted in Dynamodb?

I am using the updateItem() function, which either inserts or updates values in DynamoDB. If values are updated, I want to fetch those items and invoke a new Lambda function. How can I achieve this?
The most direct approach would be to add ReturnValues: 'UPDATED_NEW' to the params you use for your updateItem() call.
You can then tell whether you've inserted a new item because the returned Attributes will include your partition key (and sort key, if you've used a composite key).
This is because you cannot change the key of an item, so if all you've done is update an item, then you would not have updated its key. But if you have created a new item, then you would have 'updated' its key.
However, if you want to react to items being updated in a dynamo table, you could alternatively use DynamoDB Streams (docs).
These streams allow you to trigger lambdas on the transactions on the dynamo table. This lambda could then filter the events for updates and react accordingly. The advantage of this architectural approach is it means your 'onUpdate' functionality will trigger if anything updates the table- not just the specific lambda.
I don't think the previously accepted answer works. The returned Attributes never included the partition/sort keys for me, whether the item was updated or created.
What worked for me was adding ReturnValues: 'UPDATED_OLD'. If the returned Attributes is undefined, then you know an item was created.
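That check can be sketched as follows (the response shape is what the UpdateItem API returns with ReturnValues='UPDATED_OLD'; `was_created` is just an illustrative helper):

```python
def was_created(update_response):
    """With ReturnValues='UPDATED_OLD', UpdateItem returns no 'Attributes'
    key when the item did not previously exist, so an empty/missing
    Attributes means the call created a new item."""
    return not update_response.get("Attributes")

# Typical use with a boto3 Table resource (hypothetical key/attribute names):
# response = table.update_item(
#     Key={"id": "123"},
#     UpdateExpression="SET #st = :s",
#     ExpressionAttributeNames={"#st": "status"},
#     ExpressionAttributeValues={":s": "A"},
#     ReturnValues="UPDATED_OLD",
# )
# if not was_created(response):
#     ...invoke the follow-up Lambda, since this was an update...
```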

Can Dynamodb check the items regularly?

Can DynamoDB check the items regularly, instead of using a scheduled CloudWatch event to trigger a Lambda that scans the table?
Or, to put it another way, does DynamoDB have any built-in functions to check the table itself, for example whether the value in the "count" column is bigger than 5, and trigger a Lambda?
The short answer is no!
DynamoDB is a database. It stores data. As of today it does not have embedded functions like the stored procedures or triggers that are common in relational databases. You can, however, use DynamoDB Streams to implement a kind of trigger.
DynamoDB Streams can be used to start a Lambda function with the old data, the new data, or both the old and new data of the item updated/created in a table. You can then use the Lambda to check your count column and, if it is greater than 5, call another Lambda or run whatever procedure you need.
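A stream-triggered handler implementing that check might look like this sketch ('count' as a Number attribute, 'id' as the key, and the downstream invocation are all assumptions):

```python
def handler(event, context):
    """Collect keys of items whose 'count' attribute exceeded 5."""
    over_threshold = []
    for record in event.get("Records", []):
        new_image = record.get("dynamodb", {}).get("NewImage")
        if not new_image or "count" not in new_image:
            continue
        # Stream images encode numbers as strings under the "N" type tag.
        if int(new_image["count"]["N"]) > 5:
            over_threshold.append(new_image["id"]["S"])  # assumed key attribute
    # Here you could notify another Lambda, e.g. via
    # boto3.client("lambda").invoke(FunctionName="follow-up", ...)
    return over_threshold
```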

How to update multiple columns of dynamo db using aws iot rules engine

I have set of data: id, name, height and weight.
I am sending this data to AWS IoT in JSON format. From there I need to update the respective columns in a DynamoDB table, hence I have created 3 rules to update name, height and weight, keeping id as the partition key.
But when I send the message, only one column gets updated. If I disable any 2 of the rules, the remaining rule works fine. Therefore every time I update, the columns are getting overwritten.
How can I update all three columns from the incoming message?
Another answer: in your rule, use the "dynamoDBv2" action instead, which "allows you to write all or part of an MQTT message to a DynamoDB table. Each attribute in the payload is written to a separate column in the DynamoDB database ..."
The answer is: you can't do this with the original IoT gateway "dynamoDB" rule action itself. That action can only store the payload in a single column (apart from the hash and sort keys).
A way around this is to make a Lambda rule which calls, for example, a Python function that takes the message and stores it in the table. See also this other SO question.
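The dynamoDBv2 route could be sketched with boto3 like this. The rule name, topic, table name and role ARN are all hypothetical; the payload shape is the documented CreateTopicRule structure, with a single action writing each payload attribute to its own column:

```python
def build_dynamodbv2_rule_payload(topic, table_name, role_arn):
    """Topic-rule payload whose single dynamoDBv2 action writes each
    attribute of the incoming JSON message to its own column."""
    return {
        "sql": f"SELECT id, name, height, weight FROM '{topic}'",
        "actions": [{
            "dynamoDBv2": {
                "roleArn": role_arn,
                "putItem": {"tableName": table_name},
            }
        }],
    }

# Creating the rule (hypothetical names) would look like:
# import boto3
# boto3.client("iot").create_topic_rule(
#     ruleName="store_measurements",
#     topicRulePayload=build_dynamodbv2_rule_payload(
#         "sensors/measurements", "Measurements",
#         "arn:aws:iam::123456789012:role/iot-dynamodb-role"),
# )
```

With this, the three separate per-column rules become unnecessary; one rule writes all four attributes in a single put.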