AWS DynamoDb with Optimistic Locking and Batch Write Operations - amazon-web-services

Can we use DynamoDB optimistic locking with a batchWriteItem request? The AWS docs on optimistic locking mention that a ConditionalCheckFailedException is thrown when the version value differs during an update. In the case of a batchWriteItem request, will the whole batch fail, or only the record with the differing version value? Will the record that failed due to the version mismatch be returned as an unprocessed record?

You cannot. You can confirm this by looking at the low-level API syntax and noticing that there is no way to specify a condition expression.
https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html
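For completeness, the optimistic-locking pattern the docs describe works with the single-item write APIs, so you have to issue individual conditional writes instead of one batch. A minimal sketch with the AWS SDK for Java v2, assuming a hypothetical Orders table that carries a numeric version attribute:

```java
import java.util.Map;

import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.ConditionalCheckFailedException;
import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;

public class OptimisticLockingPut {
    public static void main(String[] args) {
        DynamoDbClient ddb = DynamoDbClient.create();

        // The write succeeds only if the stored version still matches the one we read.
        PutItemRequest request = PutItemRequest.builder()
                .tableName("Orders")                                   // hypothetical table
                .item(Map.of(
                        "orderId", AttributeValue.builder().s("order-1").build(),
                        "status",  AttributeValue.builder().s("SHIPPED").build(),
                        "version", AttributeValue.builder().n("2").build()))
                .conditionExpression("version = :expected")
                .expressionAttributeValues(Map.of(
                        ":expected", AttributeValue.builder().n("1").build()))
                .build();

        try {
            ddb.putItem(request);
        } catch (ConditionalCheckFailedException e) {
            // Another writer bumped the version first; re-read the item and retry.
        }
    }
}
```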

Related

Is there any notification event I can trace for completion of an execution of AWS S3 lifecycle rule?

I wanted to delete a large number of S3 files (maybe a few hundred thousand or a million; I have no control over the count) in a bulk async process. I looked through multiple blogs and collated the strategies below:
Leverage the AWS S3 REST API from an async thread of a custom application
Here the drawbacks are:
I will have to make a huge number of S3 API calls, as one request is limited to 1,000 S3 objects, and I may not know the exact S3 object keys.
Even if I identify the S3 objects to delete, I will have to first GET (list) them and then DELETE them, which makes the solution costly.
I will have to keep track of deleted chunks, and in case of a failure in the middle of the operation, I will have to build a mechanism to re-trigger the chunks that failed to be deleted.
Leverage an S3 lifecycle policy
Here the drawbacks are:
We store multiple customers' data in the same bucket, segregated by customer-id in the prefix. With a growing number of customers, we foresee that the hard limit of 1,000 rules per bucket may hit us.
To work around that drawback, we can delete a rule and free up quota for the next requests. But we were looking for an event-based notification that tells us the bulk delete operation is complete.
Again, with a growing number of customers, we may lose predictability of the bulk delete operation: jobs accumulate once the quota limit is reached, and a submitted bulk delete job may have to wait days to complete.
Create only one rule with a special bulk-delete tag and use it to set one S3 lifecycle policy
With this approach, we believe we will not hit the limit issue we expect with the previous approach. As we understand it, S3 lifecycle rules are executed once a day (though we don't know exactly when), so we are assured that the rule will be triggered within the next 24 hours at most; it will then take some time to actually complete the bulk delete operation (maybe a few minutes or hours, we don't know). Here too we have an open question: is there a notification event emitted after the completion of one execution of an S3 lifecycle rule that we can listen to, so we can mark all submitted bulk delete jobs as DONE? Without such a notification event, it is difficult to transparently communicate completion back to the end user who triggered the bulk delete async operation.
Any comments/advice on the above strategies would be helpful, especially an answer for the last strategy, which I guess is the most preferable choice I have as of now. I tried all of the strategies above and got stuck at the problem mentioned for each.
After all evaluations, we decided to delete the relevant data for a specific time range in code, as an async Java process leveraging the S3 bulk delete SDK (DeleteObjectsRequest).
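For anyone who lands here, a minimal sketch of that final approach using the AWS SDK for Java v2 equivalent of DeleteObjectsRequest (bucket and prefix names are hypothetical); each listed page contains at most 1,000 keys, which matches the per-request limit of the bulk delete API:

```java
import java.util.List;
import java.util.stream.Collectors;

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.Delete;
import software.amazon.awssdk.services.s3.model.DeleteObjectsRequest;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.ObjectIdentifier;

public class BulkDeleteSketch {
    public static void main(String[] args) {
        S3Client s3 = S3Client.create();
        String bucket = "customer-data-bucket";   // hypothetical bucket
        String prefix = "customer-42/2021/";      // hypothetical customer/time-range prefix

        // Each paginator page returns up to 1,000 keys, the same limit as a single
        // DeleteObjects request, so pages map one-to-one onto delete calls.
        s3.listObjectsV2Paginator(ListObjectsV2Request.builder()
                        .bucket(bucket)
                        .prefix(prefix)
                        .build())
                .stream()
                .forEach(page -> {
                    List<ObjectIdentifier> ids = page.contents().stream()
                            .map(o -> ObjectIdentifier.builder().key(o.key()).build())
                            .collect(Collectors.toList());
                    if (ids.isEmpty()) {
                        return;
                    }
                    s3.deleteObjects(DeleteObjectsRequest.builder()
                            .bucket(bucket)
                            .delete(Delete.builder().objects(ids).build())
                            .build());
                });
    }
}
```

Tracking which pages have been deleted (for retry after a mid-run failure) still has to be built around this loop, as noted in the first strategy above.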

Resume reading from kinesis after a KCL consumer outage [duplicate]

I can't find in the official AWS Kinesis documentation any explicit statement of the relationship between TRIM_HORIZON and the checkpoint, or between LATEST and the checkpoint.
Can you confirm my theory:
TRIM_HORIZON - If the application-name is new, I will read all the records available in the stream. If the application-name was already used, I will read from my last checkpoint.
LATEST - If the application-name is new, I will read all the records in the stream that were added after I subscribed to it. If the application-name was already used, I will read messages from my last checkpoint.
The difference between TRIM_HORIZON and LATEST applies only when the application-name is new.
AT_TIMESTAMP
-- from a specific timestamp
TRIM_HORIZON
-- all the available messages in Kinesis stream from the beginning (same as earliest in Kafka)
LATEST
-- from the latest messages, i.e. the current message that just came into Kinesis/Kafka and all incoming messages from that time onwards
From GetShardIterator documentation (which lines up with my experience using Kinesis):
In the request, you can specify the shard iterator type AT_TIMESTAMP to read records from an arbitrary point in time, TRIM_HORIZON to cause ShardIterator to point to the last untrimmed record in the shard in the system (the oldest data record in the shard), or LATEST so that you always read the most recent data in the shard.
Basically, the difference is whether you want to start from the oldest record (TRIM_HORIZON), or from "right now" (LATEST - skipping data between latest checkpoint and now).
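For reference, the iterator type is something you pass when requesting a shard iterator from the low-level API. A minimal sketch with the AWS SDK for Java v2, with hypothetical stream and shard names:

```java
import software.amazon.awssdk.services.kinesis.KinesisClient;
import software.amazon.awssdk.services.kinesis.model.GetRecordsRequest;
import software.amazon.awssdk.services.kinesis.model.GetShardIteratorRequest;
import software.amazon.awssdk.services.kinesis.model.ShardIteratorType;

public class ShardIteratorSketch {
    public static void main(String[] args) {
        KinesisClient kinesis = KinesisClient.create();

        // TRIM_HORIZON: start at the oldest untrimmed record in the shard.
        // LATEST: start at "right now", skipping everything already in the shard.
        String iterator = kinesis.getShardIterator(GetShardIteratorRequest.builder()
                        .streamName("my-stream")            // hypothetical stream name
                        .shardId("shardId-000000000000")    // hypothetical shard id
                        .shardIteratorType(ShardIteratorType.TRIM_HORIZON)
                        .build())
                .shardIterator();

        kinesis.getRecords(GetRecordsRequest.builder()
                        .shardIterator(iterator)
                        .limit(100)
                        .build())
                .records()
                .forEach(r -> System.out.println(r.sequenceNumber()));
    }
}
```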
The question clearly asks how these options relate to the checkpoint. However, none of the existing answers addresses the checkpoint at all.
An authoritative answer to this question by Justin Pfifer appears in a GitHub issue here.
The most relevant portion is
The KCL will always use the value in the lease table if it's present. It's important to remember that Kinesis itself doesn't track the position of consumers. Tracking is provided by the lease table. Leases in the KCL serve double duty. They provide both mutual exclusion, and position tracking. So for mutual exclusion a lease needs to be created, and to satisfy the position tracking an initial value must be selected.
(Emphasis added by me.)
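To make that precedence concrete, a minimal sketch of a KCL 1.x configuration (application, stream, and worker names are hypothetical); the initial position below only takes effect for shards that have no checkpoint yet in the lease table named after the application:

```java
import java.util.UUID;

import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream;
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisClientLibConfiguration;

public class KclConfigSketch {
    public static void main(String[] args) {
        // The application name also names the DynamoDB lease table; any checkpoint
        // stored there wins over the initial position configured below.
        KinesisClientLibConfiguration config = new KinesisClientLibConfiguration(
                "my-consumer-app",                       // hypothetical application name
                "my-stream",                             // hypothetical stream name
                new DefaultAWSCredentialsProviderChain(),
                "worker-" + UUID.randomUUID())
                .withInitialPositionInStream(InitialPositionInStream.TRIM_HORIZON);

        System.out.println(config.getInitialPositionInStream());
    }
}
```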
I think choosing between the two is a trade-off between whether you want to start from the most recent data or from the oldest data that hasn't yet been processed from Kinesis.
Imagine a scenario where there is a bug in your Lambda function: it throws an exception on the first record it gets and returns an error back to Kinesis, so none of the records in the stream get processed and they sit there for the retention period (one day). After you fix the bug and deploy your Lambda, it will start receiving all the messages that Kinesis has been holding, and your downstream service will have to process old data before the most recent data. This can add unwanted latency to your application if you choose TRIM_HORIZON.
But if you use LATEST, you can ignore all those previously stuck messages and have your Lambda start processing from new events/messages, improving the latency your system provides.
So you will have to decide which is more important for your customers: is losing a few data points acceptable (and what is your tolerance limit), or do you always need accurate results, for example when calculating a sum or counter?

AWS SimpleDB getAttributes consistency

The docs for the Java SDK for SimpleDB say the following regarding the getAttributes(GetAttributesRequest request) operation:
If the item does not exist on the replica that was accessed for this operation, an empty set is returned. The system does not return an error as it cannot guarantee the item does not exist on other replicas.
I understand that I can use GetAttributesRequest#setConsistentRead(true), but the docs for the operation don't mention that, and that comment above is worrying me. It seems to suggest that an item request will succeed, if you're lucky!
Is this just an omission from the docs? Am I guaranteed to get back an item using a getAttributes request if I explicitly ask for a consistent read? (provided, of course, that a previous operation successfully put/updated the item).
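For reference, a minimal sketch of the consistent-read call the question mentions, using the AWS SDK for Java v1 (domain and item names are hypothetical); per the SimpleDB docs, a consistent read reflects all writes that received a successful response before the read started:

```java
import com.amazonaws.services.simpledb.AmazonSimpleDB;
import com.amazonaws.services.simpledb.AmazonSimpleDBClientBuilder;
import com.amazonaws.services.simpledb.model.GetAttributesRequest;
import com.amazonaws.services.simpledb.model.GetAttributesResult;

public class ConsistentGetSketch {
    public static void main(String[] args) {
        AmazonSimpleDB sdb = AmazonSimpleDBClientBuilder.defaultClient();

        // ConsistentRead = true asks for a read that reflects all previously
        // acknowledged writes, at the cost of latency and throughput.
        GetAttributesResult result = sdb.getAttributes(new GetAttributesRequest()
                .withDomainName("Messages")     // hypothetical domain
                .withItemName("message-123")    // hypothetical item
                .withConsistentRead(true));

        result.getAttributes().forEach(a ->
                System.out.println(a.getName() + " = " + a.getValue()));
    }
}
```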

What is the AWS Dynamo DB write consistence?

I know that DynamoDB reads can be either eventually consistent or strongly consistent. And I read a document that says: The individual PutItem and DeleteItem operations specified in BatchWriteItem are atomic; however, BatchWriteItem as a whole is not.
But I still don't understand whether the write behavior is synchronized or not.
If this is an awkward question, please tell me.
BatchWriteItem is a batch API, meaning it allows you to specify a number of different operations to be submitted to DynamoDB for execution in the same request. So when you submit a BatchWriteItem request you are asking DynamoDB to perform a number of PutItem and/or DeleteItem requests for you.
The claim that the individual PutItem and DeleteItem requests are atomic means that each of them is atomic with respect to other requests that may want to modify the same item (identified by its partition/sort keys). In other words, data corruption cannot occur within an item as a result of two PutItem requests executing at the same time, each modifying some part of the item and leaving it in an inconsistent state.
The claim that the whole BatchWriteItem request is not atomic means that the sequence of PutItem and/or DeleteItem requests is not guaranteed to be isolated: other PutItem or DeleteItem requests, whether single or batched, can execute at the same time as the BatchWriteItem request and affect the state of the table(s) in between the individual PutItem/DeleteItem requests that make up the batch.
To illustrate the point, note that DynamoDB will not even accept a BatchWriteItem request that both puts and deletes the same item, so let's say your batch consists of the following two calls:
PutItem (partitionKey = 1000; name = 'Alpha'; value = 100)
PutItem (partitionKey = 2000; name = 'Beta'; value = 200)
And that at approximately the same time you submitted this request, another client submitted a request containing the following operation:
DeleteItem (partitionKey = 1000)
Because the batch is not executed as a single isolated unit, that delete may run in between (or right after) the two puts. Once everything settles, you could find that item 2000 exists while item 1000 does not, even though both were written by the same batch. Each individual put or delete is atomic on its own item, but the batch as a whole gives you no all-or-nothing or isolation guarantee.
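If it helps, here is what that batch looks like in code; a minimal sketch with the AWS SDK for Java v2, using a hypothetical ExampleTable. The UnprocessedItems handling is included because throttled writes are returned there rather than failing the whole call, which is another way a batch can end up only partially applied:

```java
import java.util.List;
import java.util.Map;

import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.BatchWriteItemRequest;
import software.amazon.awssdk.services.dynamodb.model.BatchWriteItemResponse;
import software.amazon.awssdk.services.dynamodb.model.PutRequest;
import software.amazon.awssdk.services.dynamodb.model.WriteRequest;

public class BatchWriteSketch {

    private static WriteRequest put(String key, String name) {
        return WriteRequest.builder()
                .putRequest(PutRequest.builder()
                        .item(Map.of(
                                "partitionKey", AttributeValue.builder().n(key).build(),
                                "name", AttributeValue.builder().s(name).build()))
                        .build())
                .build();
    }

    public static void main(String[] args) {
        DynamoDbClient ddb = DynamoDbClient.create();

        // The two puts from the example above, sent as one BatchWriteItem call.
        // Each put is atomic on its own item, but nothing isolates the pair from a
        // concurrent DeleteItem (partitionKey = 1000) issued by another client.
        BatchWriteItemResponse response = ddb.batchWriteItem(BatchWriteItemRequest.builder()
                .requestItems(Map.of("ExampleTable",            // hypothetical table name
                        List.of(put("1000", "Alpha"), put("2000", "Beta"))))
                .build());

        // Throttled writes come back here and must be retried by the caller.
        if (!response.unprocessedItems().isEmpty()) {
            ddb.batchWriteItem(BatchWriteItemRequest.builder()
                    .requestItems(response.unprocessedItems())
                    .build());
        }
    }
}
```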

Amazon S3 conditional put object

I have a system in which I get a lot of messages. Each message has a unique ID, but it can also receive updates during its lifetime. As the time between a message being sent and being handled can be very long (weeks), the messages are stored in S3. For each message, only the last version is needed. My problem is that occasionally two messages with the same ID arrive together, but in two versions (an older and a newer one).
Is there a way for S3 to have a conditional PutObject request where I can declare "put this object unless I have a newer version in S3"?
I need an atomic operation here
That's not the use case for S3, which is eventually consistent. Some ideas:
You could try to partition your messages - all messages that start with A-L go to one box, M-Z go to another box. Then each box locally checks that there are no duplicates.
Your best bet is probably some kind of database. Depending on your use case, you could use a regular SQL database, or maybe a simple RAM-only database like Redis. Write to multiple Redis DBs at once to avoid SPOF.
There is SWF which can make a unique processing queue for each item, but that would probably mean more HTTP requests than just checking in S3.
David's idea about turning on versioning is interesting. You could have a daemon that periodically trims off the old versions. When reading, you would have to do "read repair" where you search the versions looking for the newest object.
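A minimal sketch of that read-repair step with the AWS SDK for Java v2 (bucket and key names are hypothetical, and it assumes LastModified is a good enough proxy for "newest"): list the versions of the key and keep only the most recent one.

```java
import java.util.Comparator;

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListObjectVersionsRequest;
import software.amazon.awssdk.services.s3.model.ObjectVersion;

public class ReadRepairSketch {
    public static void main(String[] args) {
        S3Client s3 = S3Client.create();

        // On a versioned bucket every PutObject for the same key adds a version;
        // "read repair" here just means picking the newest one when reading.
        ObjectVersion newest = s3.listObjectVersions(ListObjectVersionsRequest.builder()
                        .bucket("messages-bucket")   // hypothetical bucket
                        .prefix("message-123")       // hypothetical message id used as the key
                        .build())
                .versions()
                .stream()
                .max(Comparator.comparing(ObjectVersion::lastModified))
                .orElseThrow(() -> new IllegalStateException("No versions found"));

        System.out.println("Latest version: " + newest.versionId());
    }
}
```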
Couldn't this be solved by using tags, and using a Condition on that when using PutObject? See "Example 3: Allow a user to add object tags that include a specific tag key and value" here: https://docs.aws.amazon.com/AmazonS3/latest/dev/object-tagging.html#tagging-and-policies