DynamoDB's documentation on transactional writes states:
Multiple transactions updating the same items simultaneously can cause conflicts that cancel the transactions. We recommend following DynamoDB best practices for data modeling to minimize such conflicts.
If there are multiple simultaneous TransactWriteItems requests on the same item, will all of the transactional write requests fail with a TransactionCanceledException? Or will at least one request succeed?
This depends purely on how the conflicts happen. If all the writes in a transaction succeed, the transaction succeeds. If some other write or transaction modifies one of the items in the meantime, it will fail.
Related
My team is working on an AWS Lambda function that has a configured timeout of 30 seconds. Given that Lambdas have this timeout constraint and the fact that they can be reused for subsequent requests, it seems like there will always be the potential for the function's execution to time out before completing all of its necessary steps. Is this a correct assumption? If so, how do we bake in resiliency so that DB updates can be rolled back if a timeout occurs after records have been updated but before a response has been returned to the function's caller?
To be more specific, my team is managing a JavaScript-based Lambda (Node.js 16.x) that sits behind an API Gateway and implements a REST method to retrieve and update job records. The method works by retrieving records from DynamoDB given certain conditions, updating their state, and then returning the updated job records to the caller. Is there a means to detect when a timeout has occurred and to roll back (either manually or automatically) the updated DB records so that they're in the same state as when the Lambda began execution?
It is important to consider the consequences of what you are trying to do here. Instead of finding ways to detect when your Lambda function is about to expire, the best practice is to first monitor a good chunk of executed requests and analyze how much time, on average, it takes to complete them. Perhaps 30 seconds is simply not enough to complete the work this Lambda function performs.
Once you settle on a timeout that suits the average execution time of your requests, you can minimize the possibility of rollbacks caused by incomplete executions by using DynamoDB's support for transactions. Transactions let you group multiple operations together and submit them as a single all-or-nothing request, thus ensuring atomicity.
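For illustration, here is a minimal sketch of what that could look like for the job-update flow described in the question, assuming the AWS SDK for JavaScript v2 and a hypothetical Jobs table keyed by jobId (the table, key, and attribute names are placeholders, not your actual schema):

// Sketch only: hypothetical "Jobs" table keyed by jobId, with a jobState attribute.
const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

async function claimJobs(jobIds) {
  // Either every job is claimed, or none of them are.
  await docClient.transactWrite({
    TransactItems: jobIds.map((jobId) => ({
      Update: {
        TableName: 'Jobs',
        Key: { jobId },
        UpdateExpression: 'SET jobState = :claimed',
        ConditionExpression: 'jobState = :open', // only claim jobs that are still open
        ExpressionAttributeValues: { ':claimed': 'CLAIMED', ':open': 'OPEN' },
      },
    })),
  }).promise();
}

The ConditionExpression plays the same role as a read-then-check step, but it is evaluated server-side at commit time, so a partially applied update cannot be left behind by a timeout.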
Another aspect of your design is how fast you can retrieve data from DynamoDB without compromising the timeout. Currently, your code retrieves records from DynamoDB and then updates them if certain conditions are met, so the read needs to happen as fast as possible so that the subsequent update can start. One way to speed up this read is to enable DAX (DynamoDB Accelerator) for in-memory acceleration. It acts as a cache in front of DynamoDB with microsecond latency.
Finally, if you want to be extra careful and not even start a transaction in DynamoDB when there will not be enough time to finish it, you can use the context object from the Lambda API to query the remaining execution time of the function. In Node.js, it looks like this:
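If you go the DAX route, the change is mostly transparent to your code, since the DAX client plugs in underneath the regular DocumentClient. Roughly, assuming the amazon-dax-client package and your own cluster endpoint (the endpoint and region values below are placeholders):

// Sketch only: the DAX cluster endpoint comes from your own configuration.
const AWS = require('aws-sdk');
const AmazonDaxClient = require('amazon-dax-client');

const dax = new AmazonDaxClient({
  endpoints: [process.env.DAX_CLUSTER_ENDPOINT], // your cluster discovery endpoint
  region: process.env.AWS_REGION,
});
// The DocumentClient API stays the same; only the underlying service changes.
const docClient = new AWS.DynamoDB.DocumentClient({ service: dax });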
const remainingTimeInMillis = context.getRemainingTimeInMillis();
if (remainingTimeInMillis < TIMEOUT_PASSED_AS_ENVIRONMENT_VARIABLE) {
  // Not enough time left: cancel the execution and clean things up
}
The use case is to keep track of the exact number of items in a warehouse.
The warehouse receives incoming items from multiple customers, and it has to keep track of the item count per customer so that the warehouse owner knows the accurate count of items for each customer.
So, if we were to use QLDB to increment item_count per customer_id as items enter the warehouse, would QLDB be able to handle multi-item transactions?
If there were a read/write inconsistency, would the write to QLDB fail? We want the writes to be consistent, but we are okay with reading T1's data even if the current data is at T2.
Short answer: yes.
QLDB supports transactions under optimistic concurrency control (OCC). Each transaction can have multiple statements. These statements can query the current state of the ledger to determine whether the transaction can proceed. If it can, keep issuing statements until you are ready to commit. Your commit will be rejected if any other transaction interfered with it (transactions must be serializable).
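For a sense of what that looks like in code, here is a rough sketch with the Node.js driver (amazon-qldb-driver-nodejs); the ledger, table, and column names are assumptions for this example, not anything prescribed by QLDB:

// Sketch only: ledger "warehouse" and table "ItemCounts" are made-up names.
const { QldbDriver } = require('amazon-qldb-driver-nodejs');
const driver = new QldbDriver('warehouse');

async function incrementItemCount(customerId) {
  // executeLambda retries the whole function if the commit is rejected
  // because another transaction touched the same data (OCC conflict).
  await driver.executeLambda(async (txn) => {
    const result = await txn.execute(
      'SELECT item_count FROM ItemCounts WHERE customer_id = ?', customerId);
    const current = result.getResultList()[0].get('item_count').numberValue();
    await txn.execute(
      'UPDATE ItemCounts SET item_count = ? WHERE customer_id = ?',
      current + 1, customerId);
  });
}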
I'm not sure how to achieve consistent read across multiple SELECT queries.
I need to run several SELECT queries and to make sure that between them, no UPDATE, DELETE or CREATE has altered the overall consistency. The best case for me would be something non blocking of course.
I'm using MySQL 5.6 with InnoDB and default REPEATABLE READ isolation level.
The problem is when I'm using the RDS Data Service beginTransaction with several executeStatement calls (passing the provided transactionId): I'm NOT getting the full result at the end when calling commitTransaction.
The commitTransaction only provides me with a { transactionStatus: 'Transaction Committed' }.
I don't understand, isn't the commitTransaction function supposed to give me the whole dataset result (from my many SELECTs)?
Instead, even with a transactionId, each executeStatement returns me an individual result... This behaviour is obviously NOT consistent.
With SELECTs in one transaction under REPEATABLE READ you should see the same data and not see any changes made by other transactions. Yes, data can be modified by other transactions, but while inside your transaction you operate on a read view and can't see those changes. So it is consistent.
The only way to make sure that no data is actually changed between selects is to lock the tables/rows, e.g. with SELECT ... FOR UPDATE - but that should not be necessary here.
Transactions should be short and fast; locking tables and preventing updates while some long-running chain of selects runs is obviously not an option.
Queries issued against the database run at the time they are issued. Their results stay uncommitted until commit. A query may be blocked if it targets a resource another transaction holds a lock on, and it may fail if another transaction modified that resource, resulting in a conflict.
Transaction isolation determines how the effects of this and other transactions happening at the same moment are handled (see the Wikipedia article on isolation levels).
With isolation level REPEATABLE READ (which, incidentally, Aurora Replicas for Aurora MySQL always use for operations on InnoDB tables) you operate on a read view of the database and see only data committed before that read view was established.
This means that SELECTs in one transaction will see the same data, even if changes were made by other transactions.
By comparison, with transaction isolation level READ COMMITTED, subsequent selects in one transaction may see different data that was committed in between them by other transactions.
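As for the commitTransaction response in the question: it only reports the transaction status and never returns query data, so you collect the result of each executeStatement yourself while the transaction is open. A rough sketch, assuming the AWS SDK for JavaScript v2 with placeholder ARNs and database name:

// Sketch only: the ARNs and database name are placeholders for your own
// Aurora Serverless / Data API configuration.
const AWS = require('aws-sdk');
const rdsData = new AWS.RDSDataService();
const common = {
  resourceArn: process.env.CLUSTER_ARN,
  secretArn: process.env.SECRET_ARN,
  database: 'mydb',
};

async function consistentSelects(sqlStatements) {
  const { transactionId } = await rdsData.beginTransaction(common).promise();
  const results = [];
  for (const sql of sqlStatements) {
    // Each SELECT returns its own records; collect them as you go.
    const { records } = await rdsData
      .executeStatement({ ...common, sql, transactionId })
      .promise();
    results.push(records);
  }
  // commitTransaction only reports the status, never the query results.
  await rdsData.commitTransaction({
    resourceArn: common.resourceArn,
    secretArn: common.secretArn,
    transactionId,
  }).promise();
  return results;
}

The consistency across those SELECTs then comes from REPEATABLE READ as described above, not from commitTransaction.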
Using transactWriteItems in the aws-sdk (js) we get a TransactionCanceledException. The reason given within that exception is TransactionConflict. Sometimes all actions in the transaction fail, sometimes only a few or only one. We do run multiple transactions in parallel that can operate on the same items. The documentation doesn't mention this particular error. Excerpt of the possible reasons:
- A condition in one of the condition expressions is not met.
- A table in the TransactWriteItems request is in a different account or region.
- More than one action in the TransactWriteItems operation targets the same item.
- There is insufficient provisioned capacity for the transaction to be completed.
- An item size becomes too large (larger than 400 KB), or a local secondary index (LSI) becomes too large, or a similar validation error occurs because of changes made by the transaction.
- There is a user error, such as an invalid data format.
None of these apply, and when retrying the transaction it seems to eventually work. Does anyone know about this exception? I can't find anything documented.
What you are experiencing is not a bug; it's actually part of the feature, and it was mentioned in the launch announcement.
Items are not locked during a transaction. DynamoDB transactions provide serializable isolation. If an item is modified outside of a transaction while the transaction is in progress, the transaction is canceled and an exception is thrown with details about which item or items caused the exception.
As an aside, instead of locking, DynamoDB uses something called optimistic concurrency control (which is also (confusingly) called optimistic locking). If you’re interested in learning more about that, the Wikipedia article on Optimistic Concurrency Control is pretty good.
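As a side note on what optimistic concurrency looks like outside of transactions: the common DynamoDB pattern is a version attribute plus a conditional write, something like the following sketch (the table and attribute names are made up for the example):

// Sketch only: a hypothetical "Jobs" table with a numeric "version" attribute.
const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

async function updateIfUnchanged(jobId, newState, expectedVersion) {
  // The write only goes through if nobody else bumped the version since we read it.
  await docClient.update({
    TableName: 'Jobs',
    Key: { jobId },
    UpdateExpression: 'SET jobState = :state, #v = #v + :one',
    ConditionExpression: '#v = :expected',
    ExpressionAttributeNames: { '#v': 'version' },
    ExpressionAttributeValues: {
      ':state': newState,
      ':one': 1,
      ':expected': expectedVersion,
    },
  }).promise(); // throws ConditionalCheckFailedException if someone else wrote first
}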
Back to the matter at hand, the AWS documentation for transactions says:
Multiple transactions updating the same items simultaneously can cause conflicts that cancel the transactions. We recommend following DynamoDB best practices for data modeling to minimize such conflicts.
Specifically for TransactWriteItems, they say:
Write transactions don't succeed under the following circumstances:
When an ongoing TransactWriteItems operation conflicts with a concurrent TransactWriteItems request on one or more items in the TransactWriteItems operation. In this case, the concurrent request fails with a TransactionCancelledException
Similarly for TransactGetItems:
Read transactions don't succeed under the following circumstances:
When there is an ongoing TransactGetItems operation that conflicts with a concurrent PutItem, UpdateItem, DeleteItem or TransactWriteItems request. In this case the TransactGetItems operation fails with a TransactionCancelledException
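The practical takeaway is that a TransactionConflict cancellation under contention is expected and safe to retry, as long as your writes are idempotent. A minimal retry sketch, assuming the AWS SDK for JavaScript v2 and that params is the TransactWriteItems parameter object your code already builds:

// Sketch only: retries conflict cancellations with exponential backoff.
const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

async function transactWriteWithRetry(params, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await docClient.transactWrite(params).promise();
    } catch (err) {
      // Only retry conflict cancellations; rethrow anything else immediately.
      if (err.code !== 'TransactionCanceledException' || attempt === maxAttempts) {
        throw err;
      }
      // Exponential backoff with jitter before the next attempt.
      const delayMs = Math.random() * 100 * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}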
I have a DynamoDB table where I am using transactional writes (https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/transactions.html). The transaction consists of 2 puts. Let's say the first put succeeds and the second fails. In this scenario, the first put will be rolled back by the transaction library.
I also have DynamoDB streams (https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html) enabled on the table and another application consumes from that stream.
Question: In the rollback scenario, will the first successful put result in a DynamoDB stream event and the rollback will result in another? If yes, is there is a way to prevent this, that is, to ensure that a stream event is triggered only for a fully completed transaction?
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/transaction-apis.html
Changes made with transactions are propagated to global secondary indexes (GSIs), DynamoDB streams, and backups eventually, after the transaction completes successfully. Because of eventual consistency, tables restored from an on-demand or point-in-time recovery (PITR) backup might contain some but not all of the changes made by a recent transaction.
So, as I read it, you won't see anything in the stream until after the transaction completes successfully.