Can DynamoDB check items regularly, instead of using a scheduled CloudWatch event to trigger a Lambda that scans the table?
In other words, does DynamoDB have any built-in functionality to check the table itself, for example to detect when the value in the "count" column is greater than 5 and trigger a Lambda?
The short answer is no!
DynamoDB is a database: it stores data. At this time it does not have embedded functions like the stored procedures or triggers that are common in relational databases. You can, however, use DynamoDB Streams to implement a kind of trigger.
DynamoDB Streams can be used to invoke a Lambda function with the old data, the new data, or both for an item that is created or updated in a table. That Lambda can then check your count column and, if the value is greater than 5, call another Lambda or run whatever procedure you need.
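For illustration, here is a minimal Python sketch of such a stream handler; the attribute name count and the downstream function name my-follow-up-function are assumptions, not anything DynamoDB prescribes:

```python
# Minimal sketch of a DynamoDB stream handler, assuming a numeric "count"
# attribute; the downstream function name is a placeholder.
import json
import boto3

lambda_client = boto3.client("lambda")

def handler(event, context):
    for record in event["Records"]:
        # Stream records carry DynamoDB's typed JSON; numbers arrive as strings.
        new_image = record.get("dynamodb", {}).get("NewImage")
        if not new_image:
            continue  # e.g. a REMOVE event carries no new image
        count = int(new_image.get("count", {}).get("N", "0"))
        if count > 5:
            # Fire-and-forget invocation of the follow-up Lambda.
            lambda_client.invoke(
                FunctionName="my-follow-up-function",
                InvocationType="Event",
                Payload=json.dumps({"keys": record["dynamodb"]["Keys"]}),
            )
```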
I need to move data from one DynamoDB table to another after applying a transformation.
What is the best approach to do that?
Do I need to write a script that reads selected data from one table and puts it into another,
or should I follow the CSV export route?
You need to write a script to do so. However, you may wish to first export the data to S3 using DynamoDB's native export feature, as it does not consume capacity on the table, ensuring you do not impact production traffic, for example.
If your table is not serving production traffic, or it is not too large, then you can simply use Lambda functions to read your items, transform them, and write them to the new table.
If your table is large, you can use AWS Glue to achieve the same result in a distributed fashion.
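For the small-table case, a copy script could look roughly like this (a sketch only; the table names and transform() are placeholders for your own names and logic):

```python
# Rough sketch of a scan/transform/write copy; table names and transform()
# are placeholders.
import boto3

dynamodb = boto3.resource("dynamodb")
source = dynamodb.Table("source-table")
target = dynamodb.Table("target-table")

def transform(item):
    # Placeholder: apply whatever transformation you need.
    return item

def copy_table():
    scan_kwargs = {}
    while True:
        page = source.scan(**scan_kwargs)
        # batch_writer handles the 25-item batching limit and retries for us.
        with target.batch_writer() as batch:
            for item in page["Items"]:
                batch.put_item(Item=transform(item))
        if "LastEvaluatedKey" not in page:
            break
        scan_kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]

copy_table()
```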
Is this a live table that is used on prod?
If it is, here is what I usually do (a sketch of the Lambda follows below):
1) Enable DynamoDB Streams on the original table (if not already enabled)
2) Create a Lambda function that has access to both tables
3) Place the transformation logic in the Lambda
4) Subscribe the Lambda to the DynamoDB stream
5) Update every item on the original table (for example, set a new field called 'migrate')
6) All items now flow through the Lambda, which stores them, transformed, in the new table
7) Switch your application over to the new table
8) Check that everything still works
9) Delete the Lambda and the old table, and disable DynamoDB Streams (if no longer needed)
This approach is the only one I found that can guarantee 100% uptime during the migration.
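As a sketch of what the migration Lambda in the steps above could look like (the table name new-table and transform() are placeholders):

```python
# Sketch of the migration Lambda; "new-table" and transform() are placeholders.
import boto3
from boto3.dynamodb.types import TypeDeserializer

table = boto3.resource("dynamodb").Table("new-table")
deserializer = TypeDeserializer()

def transform(item):
    # Placeholder for your transformation logic.
    return item

def handler(event, context):
    for record in event["Records"]:
        if record["eventName"] in ("INSERT", "MODIFY"):
            # Stream images use DynamoDB's typed JSON; convert to plain Python.
            image = record["dynamodb"]["NewImage"]
            item = {k: deserializer.deserialize(v) for k, v in image.items()}
            table.put_item(Item=transform(item))
```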
If the table is not live, then you can just export it to S3 and then import it into the new table:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBPipeline.html
I have a streaming app that is actively putting data into DynamoDB.
I want to store the last 100 added items and delete the older ones; it seems that the TTL feature will not work in this case.
Any suggestions?
There is no feature within Amazon DynamoDB that enforces only keeping the last n items.
Instead, enforce the 100-item maximum within your application, perhaps by storing and maintaining a running counter.
I'd do this via a Lambda function triggered by a stream on the DynamoDB table in question.
The Lambda would then delete the older entries each time a change is made to the table. You'd need some sort of high-water mark (HWM) for the table items and some way to keep track of it; I'd keep it in a secondary DynamoDB table. Each new item put to the item table would increment the HWM and store it as a field on the item, essentially implementing an auto-increment field, since those don't exist in DynamoDB. The Lambda function could then delete any item whose auto-increment ID is HWM - 100 or less.
There may be better ways, but this would achieve the goal.
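A rough sketch of that idea, where all table and attribute names (items, counters, pk, seq) are assumptions for illustration:

```python
# Sketch of a "keep the last 100" scheme using an atomic counter; all names
# here are placeholders, and "seq" is assumed to be the table's sort key.
import boto3

dynamodb = boto3.resource("dynamodb")
items = dynamodb.Table("items")
counters = dynamodb.Table("counters")

def next_sequence():
    # Atomic counter: implements the "auto-increment" described above.
    resp = counters.update_item(
        Key={"name": "items-hwm"},
        UpdateExpression="ADD seq :one",
        ExpressionAttributeValues={":one": 1},
        ReturnValues="UPDATED_NEW",
    )
    return int(resp["Attributes"]["seq"])

def put_and_trim(item, keep=100):
    hwm = next_sequence()
    item["seq"] = hwm
    items.put_item(Item=item)
    old_seq = hwm - keep
    if old_seq > 0:
        # Since "seq" is assumed to be the sort key, the old item can be
        # addressed directly.
        items.delete_item(Key={"pk": item["pk"], "seq": old_seq})
```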
I have a DynamoDB table whose items have these attributes: id, user, status. Status can take values A or B.
Is it possible to trigger a Lambda based only on the value of the 'status' attribute?
For example, trigger the Lambda when a new item is added to DDB with status == A, or when the status of an existing item is updated to A.
(I am looking into DynamoDB streams for achieving this, but I have not come across an example of anyone using them for this use case.)
Is it possible to monitor a DDB table based on the value of a certain attribute?
For example, when status == B, I don't want to trigger a Lambda, but only emit a metric for that item. Basically, I want a metric showing how many items in the table have status == B at a given point in time.
If not with DynamoDB, are the above two possible with any other storage type?
Yes, as your initial research has uncovered, this is something you'll want to use DynamoDB Streams for.
You can trigger a Lambda function when an item is written, updated, or removed in DynamoDB, and you can configure your stream subscription to filter on only the attributes and values you care about.
Lambda recently introduced the ability to filter DynamoDB stream events before invoking your function; you can read more about how that works and how to configure it here.
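For example, with boto3 the filtered subscription could be configured roughly like this (the stream ARN and function name are placeholders); this pattern only invokes the function when the new image's status is A:

```python
# Sketch of an event source mapping with a filter; ARN and function name
# are placeholders.
import json
import boto3

lambda_client = boto3.client("lambda")

# Match records whose new image has status == "A".
pattern = {"dynamodb": {"NewImage": {"status": {"S": ["A"]}}}}

lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:dynamodb:us-east-1:123456789012:table/my-table/stream/2023-01-01T00:00:00.000",
    FunctionName="my-status-function",
    StartingPosition="LATEST",
    FilterCriteria={"Filters": [{"Pattern": json.dumps(pattern)}]},
)
```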
For more information about DynamoDB Stream use cases, this post may be helpful.
I have a flow in which we have to persist some items in DynamoDB for a specific time. After the items have expired, we have to call some other services to notify them that the data has expired.
I was thinking about two solutions:
1) Move the expiry check into the Java logic:
Retrieve the DynamoDB data in batches, identify the expired items in Java, then delete the data in batches and notify the other services.
There are some limitations:
BatchGetItem lets you retrieve at most 100 items.
BatchWriteItem lets you delete at most 25 items.
2) Move the expiry check into the database logic:
Query DynamoDB to check which items have expired (and delete them), and return their IDs to the client so that we can notify the other services.
Again, there are some limitations:
The result set from a Query is limited to 1 MB per call.
For both solutions there would be a job that runs periodically, or an AWS Lambda triggered periodically that calls an endpoint in our app to delete the items from the database and notify the other services.
My question is whether DynamoDB is suitable for my case, or whether I should use a relational database such as MySQL that doesn't have these kinds of limitations. What do you think? Thanks!
Have you considered using the DynamoDB TTL feature? This allows you to create a time-based column in your table that DynamoDB will use to automatically delete the items based on the time value.
This requires no implementation on your part and no polling, querying, or batching limitations. You will need to populate a TTL column, but you may already have that information if you are rolling your own expiration logic.
If other services need to be notified when a TTL event occurs, you can create a Lambda that processes the DynamoDB stream and takes action when a TTL delete event occurs.
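A minimal sketch of such a handler; TTL deletions show up in the stream as REMOVE records attributed to the DynamoDB service, and notify_services() is a placeholder for your own call-out:

```python
# Sketch of a stream handler that reacts only to TTL deletions; TTL deletes
# arrive as REMOVE records whose userIdentity is the DynamoDB service.
def notify_services(keys):
    # Placeholder: call the other services with the expired item's keys.
    print("expired:", keys)

def handler(event, context):
    for record in event["Records"]:
        user = record.get("userIdentity", {})
        if (
            record["eventName"] == "REMOVE"
            and user.get("type") == "Service"
            and user.get("principalId") == "dynamodb.amazonaws.com"
        ):
            notify_services(record["dynamodb"]["Keys"])
```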
I'm a beginner with DynamoDB and DynamoDB table streams. I have already created an AWS Lambda and enabled a DynamoDB stream with a trigger that invokes my Lambda for every added/updated/deleted record. Now I want to perform an initial sync operation for all my existing records. How can I do this?
Is there any way to make all existing records in a table be "reprocessed" and added to the stream (so they can be processed by my Lambda)?
Do I have to write a custom script?
To my knowledge there is no way to do this without writing some custom script.
You could, for instance, write a script that reads every current item out of the table and then writes it back, overwriting itself and putting a new entry in the stream, which would then be handled by your existing Lambda.
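A rough sketch of such a touch script, with my-table as a placeholder:

```python
# Sketch of a "touch" script: rewriting each item to itself produces a
# stream record for the existing Lambda to process. "my-table" is a placeholder.
import boto3

table = boto3.resource("dynamodb").Table("my-table")

scan_kwargs = {}
while True:
    page = table.scan(**scan_kwargs)
    for item in page["Items"]:
        # Each put generates a MODIFY record in the stream.
        table.put_item(Item=item)
    if "LastEvaluatedKey" not in page:
        break
    scan_kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```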
Another option is to not use the stream at all for the existing items in the table. Leave the stream and Lambda as-is for all future writes to the table, and write a script that goes through all the existing items and processes them accordingly.
I think that by creating another Lambda and setting the startingPosition to TRIM_HORIZON, you will be able to get all records again from the stream.
https://docs.aws.amazon.com/lambda/latest/dg/with-ddb.html#services-dynamodb-eventsourcemapping