I am using DynamoDB for a project. I have a use case where I maintain a timeline for objects, i.e. each entry stores its start time, its end time, and the start time of the next entry. New entries can be added between two existing entries (o1 and o2), in which case I have to update the next-start time stored in o1 and set the next-start time of the new entry to the start time of o2. This can cause problems when two new entries are being added between the same two entries at once, and would probably require transactions. Can someone suggest how this can be handled?
Update: My data model looks like this:
objectId(Hash Key), startTime(Sort Key), endTime, nextStartTime
1, 1, 5, 4
1, 4, 6, 8
1, 8, 10, 9
So, it's possible that a new entry comes in whose start time is 5. In a transaction I will then have to update nextStartTime of the second entry to 5 and insert a new entry after the second entry whose nextStartTime is the start time of the third entry. Meanwhile, another entry might come in whose start time also falls between the second and third entries (say 7, for example). I want the two transactions to be isolated from each other. In traditional SQL DBs this would be possible, as the second entry would be locked for the duration of the transaction, but DynamoDB doesn't lock items. So I am wondering: if I use transactions, would the two transactions protect data integrity?
DynamoDB supports optimistic locking. This is achieved via conditional writes.
You can do it manually by introducing a version attribute or you can use the one provided (hopefully) by your SDK. Here is a link to AWS docs.
TLDR
two objects have to update the same timeline at the same time
one will succeed the other will fail with a specific error
you will have to retry the failing one
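To make the read, conditional write, retry cycle concrete, here is a minimal in-memory sketch. It does not talk to DynamoDB: `VersionedStore`, `put_if_version` and `update_with_retry` are hypothetical names that only imitate a conditional write on a version attribute.

```python
import threading


class ConditionalCheckFailed(Exception):
    """Raised when the stored version differs from the expected one."""


class VersionedStore:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._items = {}
        # The lock makes check-and-write atomic, like a single conditional write.
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            item = self._items.get(key)
            return dict(item) if item else None

    def put_if_version(self, key, item, expected_version):
        """Write only if the stored version equals expected_version (None = no item yet)."""
        with self._lock:
            current = self._items.get(key)
            current_version = current["version"] if current else None
            if current_version != expected_version:
                raise ConditionalCheckFailed(key)
            self._items[key] = dict(item, version=(expected_version or 0) + 1)


def update_with_retry(store, key, mutate, max_attempts=50):
    """Re-read and re-apply `mutate` until the conditional write succeeds."""
    for _ in range(max_attempts):
        current = store.get(key)
        expected = current["version"] if current else None
        new_item = mutate(dict(current) if current else {})
        try:
            store.put_if_version(key, new_item, expected)
            return
        except ConditionalCheckFailed:
            continue  # somebody else won the race; read again and retry
    raise RuntimeError("gave up after repeated conflicts")
```

The important part is that the retry re-reads the item before mutating it again; retrying with the stale copy would simply fail again.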
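DynamoDB also has transactions. However, they are limited in the number of items per transaction (25 when this was written; AWS has since raised the limit to 100) and consume twice the capacity units of a plain write. If you can get away with optimistic locking, go for it.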
Hope this was helpful
Update with more info on transactions
From this doc
Error Handling for Writing
Write transactions don't succeed under the following circumstances:
When a condition in one of the condition expressions is not met.
When a transaction validation error occurs because more than one action in the same TransactWriteItems operation targets the same item.
When a TransactWriteItems request conflicts with an ongoing TransactWriteItems operation on one or more items in the TransactWriteItems request. In this case, the request fails with a TransactionCanceledException.
When there is an insufficient provisioned capacity for the transaction to be completed.
When an item size becomes too large (larger than 400 KB), or a local secondary index (LSI) becomes too large, or a similar validation error occurs because of changes made by the transaction.
When there is a user error, such as an invalid data format.
They claim that if there are two ongoing transactions on the same item, one will fail.
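For the timeline case in the question, a sketch of what such a transaction request could look like, assuming the low-level boto3 client format. The helper name `build_insert_between_request`, the table name and the choice of conditions are my own assumptions; the code only builds the request dictionary and does not call AWS.

```python
def build_insert_between_request(table, object_id, prev_start, new_start, new_end, next_start):
    """Build keyword arguments for the low-level client's transact_write_items call.

    The Update is conditioned on the previous entry still pointing at `next_start`,
    so a competing insert into the same gap makes this transaction fail.
    """

    def key(start):
        return {"objectId": {"N": str(object_id)}, "startTime": {"N": str(start)}}

    return {
        "TransactItems": [
            {
                "Update": {
                    "TableName": table,
                    "Key": key(prev_start),
                    "UpdateExpression": "SET nextStartTime = :new",
                    "ConditionExpression": "nextStartTime = :expected",
                    "ExpressionAttributeValues": {
                        ":new": {"N": str(new_start)},
                        ":expected": {"N": str(next_start)},
                    },
                }
            },
            {
                "Put": {
                    "TableName": table,
                    "Item": {
                        **key(new_start),
                        "endTime": {"N": str(new_end)},
                        "nextStartTime": {"N": str(next_start)},
                    },
                    # Refuse to overwrite an entry that already exists at this start time.
                    "ConditionExpression": "attribute_not_exists(startTime)",
                }
            },
        ]
    }
```

With the sample data from the question, `build_insert_between_request("Timeline", 1, 4, 5, 7, 8)` describes inserting the entry starting at 5 (the end time 7 is made up). The result would be passed as `client.transact_write_items(**request)`. If the concurrent insert for start time 7 commits first, nextStartTime of the second entry is no longer 8, the condition fails, and the caller has to re-read and retry.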
Why store the nextStartTime in the item? The nextStartTime is simply the start time of the next item, right? Seems like it'd be much easier to just pull the item as well as the next item to get the full picture at read-time. With a Query you can do this in one call, and so long as items are less than 2 KB in size it wouldn't even consume more RCUs than a get item would.
Simpler design, no cost for transactional writes, no need to do extensive testing on thread safety.
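A small sketch of the read-time approach, using an in-memory list in place of the table. `query_from` is a hypothetical stand-in for a Query with a key condition like `objectId = :id AND startTime >= :start` and a limit of 2; the attribute names are taken from the question.

```python
def query_from(items, object_id, start_time, limit=2):
    """Imitate a Query: same hash key, sort key >= start_time, ascending, limited."""
    matching = sorted(
        (it for it in items if it["objectId"] == object_id and it["startTime"] >= start_time),
        key=lambda it: it["startTime"],
    )
    return matching[:limit]


def item_with_next_start(items, object_id, start_time):
    """Return the entry at start_time with nextStartTime derived from its successor."""
    page = query_from(items, object_id, start_time)
    if not page or page[0]["startTime"] != start_time:
        return None  # no entry with exactly this start time
    item = dict(page[0])
    item["nextStartTime"] = page[1]["startTime"] if len(page) > 1 else None
    return item
```

Inserting a new entry is then a single plain write, and the "next start time" of its neighbour changes automatically because it is never stored.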
I'm having trouble updating a single item many times at once. If I try to update an item with new attributes many times like so:
UpdateExpression: 'SET attribute.#uniqueId = :newAttribute'
not all of the updates go through. I tried sending 20 updates with unique ids and this resulted in only 15 new attributes. This also occurs in my local dynamodb instance. I assume that the updates are somehow overwriting each other in a "last update wins" scenario but I'm not sure. How can I solve this?
DynamoDB is eventually consistent on update, so "race conditions" are possible. If you want more strict logic in writes, take a look at transactions
Items are not locked during a transaction. DynamoDB transactions provide serializable isolation. If an item is modified outside of a transaction while the transaction is in progress, the transaction is canceled and an exception is thrown with details about which item or items caused the exception.
Your observation is very interesting, and contradicts observations made in the past in Are DynamoDB "set" values CDRTs? and Concurrent updates in DynamoDB, are there any guarantees? - in those issues people observed that concurrent writes to different set items or to different top-level attributes seem to not get overwritten. Neither case is exactly the same as what you tested (nested attributes), though, so it's not a definitive proof there was something wrong with your test, but it's still surprising.
Presentations made in the past by the DynamoDB developers suggested that in DynamoDB writes happen on a single node (the designated "leader" of the partition), and that this node can serialize the concurrent writes. This serialization is needed to allow conditional updates, counter increments, etc., to work safely with concurrent writes. Presumably, the same serialization could have also allowed multiple sub-attributes to be modified concurrently safely. If it doesn't, it might mean that this serialization is deliberately disabled for certain updates, perhaps all unconditional updates (without a ConditionExpression). This is very surprising, and should have been documented by Amazon...
In Google Spanner, commit timestamps are generated by the server and based on "TrueTime" as discussed in https://cloud.google.com/spanner/docs/commit-timestamp. This page also states that timestamps are not guaranteed to be unique, so multiple independent writers can generate timestamps that are exactly the same.
The documentation of consistency guarantees states: "In addition if one transaction completes before another transaction starts to commit, the system guarantees that clients can never see a state that includes the effect of the second transaction but not the first."
What I'm trying to understand is the combination of
Multiple concurrent transactions committing "at the same time" resulting in the same commit timestamp (where the commit timestamp forms part of a key for the table)
A reader observing new rows being entered into above table
Under these circumstances, is it possible that a reader can observe some but not all of the rows that will (eventually) be stored with the exact same timestamp? Or put differently, if searching for all rows up to a known exact timestamp while rows are being inserted with that timestamp, is it possible that the query first returns some of the results, but when executed again returns more?
The context of this is an attempt to model a stream of events ordered by time in an append only manner - I need to be able to keep what is effectively a cursor to a particular point in time (point in the stream of events) and need to know whether or not having observed events at time T means you can never get more events again at exactly time T.
Spanner is externally consistent, meaning that any reader will only be able to read the results of completed transactions...
As with all externally consistent DBs, it is not possible for a reader outside of a transaction to read the 'pending state' of another transaction. So a reader at time T will only be able to see transactions that have been committed before time T.
Multiple simultaneous insert/update transactions at commit time T (which would affect different rows, otherwise they could not be simultaneous) would not be seen by the reader at time T, but both would be seen by a reader at T+1.
I ... need to know whether or not having observed events at time T means you can never get more events again at exactly time T.
Yes - ish. Rephrasing slightly as this is nuanced:
Having read events up to and including time T means you will never get any more events occurring with time equal to or before time T
But remember that the commit timestamp column is a simple TIMESTAMP column where any value can be stored -- it is the application that requests that the value stored is the commit timestamp, and there is nothing at the DB level to stop the application storing any value it likes...
As always with Spanner, it is the application which has to enforce/maintain the data integrity.
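Under that guarantee, the cursor from the question can be sketched as below. This is an in-memory illustration only: `poll` and the `commit_ts` field are hypothetical names, and it is only correct if the timestamp column really holds server-assigned commit timestamps, as cautioned above.

```python
def poll(events, cursor):
    """Return events strictly after `cursor`, plus the advanced cursor.

    Relies on the assumption that once events up to time T have been read,
    no further event with a timestamp <= T can appear.
    """
    fresh = sorted(
        (e for e in events if e["commit_ts"] > cursor),
        key=lambda e: e["commit_ts"],
    )
    new_cursor = fresh[-1]["commit_ts"] if fresh else cursor
    return fresh, new_cursor
```

Two events sharing the same timestamp are returned together by the same poll, so advancing the cursor to the largest timestamp seen does not skip either of them.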
Scenario: We have a Dynamo DB table supporting Optimistic Locking with Version Number. Two concurrent threads are trying to save two different entries with the same primary key value to that Table.
Question: Will ConditionalCheckFailedException be thrown for the latter save action?
Yes, the second thread which tries to insert the same data would throw ConditionalCheckFailedException.
com.amazonaws.services.dynamodbv2.model.ConditionalCheckFailedException
As soon as the item is saved in database, the subsequent updates should have the version matching with the value on DynamoDB table (i.e. server side value).
save — For a new item, the DynamoDBMapper assigns an initial version number 1. If you retrieve an item, update one or more of its properties and attempt to save the changes, the save operation succeeds only if the version number on the client-side and the server-side match. The DynamoDBMapper increments the version number automatically.
We had a similar use case in the past, but in our case multiple threads first read from DynamoDB and then tried to update the values.
The version may therefore change between the time a thread reads the item and the time it tries to update it. If you don't read the latest value from DynamoDB before writing, the intermediate update is lost (this is known as the lost update problem; refer to the AWS docs for more info).
I am not sure whether you have this use case, but if you simply have two threads trying to update the value and one of them holds a different version than the one stored when its request reaches DynamoDB, you will get a ConditionalCheckFailedException.
More info about this error can be found here http://grepcode.com/file/repo1.maven.org/maven2/com.michelboudreau/alternator/0.10.0/com/amazonaws/services/dynamodb/model/ConditionalCheckFailedException.java
I'm not sure about proper design of an approach.
We use optimistic locking with an incrementing long version stored on every entity. Each update of such an entity is executed via a compare-and-swap operation, which succeeds or fails depending on whether some other client updated the entity in the meantime. This is classic optimistic locking, as done by e.g. Hibernate.
We also need to adopt a retry approach. We use HTTP-based storage (etcd), and it can happen that an update request simply times out.
And here is the problem: how to combine optimistic locking and retries. Here is the specific issue I'm facing.
Let's say I have an entity with version=1 and I'm trying to update it. The next version is obviously 2. My client then executes a conditional update. It succeeds only when the version in persistence is 1, and the version is atomically updated to 2. So far, so good.
Now, let's say that the response to the update request does not arrive. At this moment it's impossible to say whether it succeeded or not. The only thing I can do is retry the update. The in-memory entity still contains version=1 and intends to update it to 2.
The real problem arises now. What if the second update fails because the version in persistence is 2 and not 1?
There are two possible reasons:
the first request indeed performed the update: the operation was successful, but the response got lost or my client timed out. The response just did not arrive, but the update went through
some other client performed the update concurrently in the background
Now I can't say which is true. Did my client update the entity, or did some other client? Did the operation pass or fail?
The approach we currently use just compares the persisted entity with the entity in main memory, either via Java equals or JSON content equality. If they are equal, the update is declared successful. I'm not satisfied with this algorithm, as it seems to me neither cheap nor sound.
Another possible approach is to use a timestamp instead of a long version. Every client generates its own timestamp within the update operation, on the assumption that a concurrent client would, with high probability, generate a different one. The problem for me is that this is only probabilistic, especially when two concurrent updates come from the same machine.
Is there any other solution?
You can fake transactions in etcd by using a two-step protocol.
Algorithm for updating:
First phase: record the update to etcd
add an "update-lock" node with a fairly small TTL. If it exists, wait until it disappears and try again.
add a watchdog to your code. You MUST abort if performing the next steps takes longer than the lock's TTL (or if you fail to refresh it).
add a "update-plan" node with [old,new] values. Its structure is up to you, but you need to ensure that the old values are copied while you hold the lock.
add a "committed-update" node. At this point you have "atomically" updated the data.
Second phase: perform the actual update
read the "update-plan" node and apply the changes it describes.
If a change fails, verify that the new value is present.
If it's not, you have a major problem. Bail out.
delete the committed-update node
delete the update-plan node
delete the update-lock node
If you want to read consistent data:
While there is no committed-update node, your data are OK.
Otherwise, wait for it to get deleted.
Whenever committed-update is present but update-lock is not, initiate recovery.
Transaction recovery, if you find an update-plan node without a lock:
Get the update-lock.
if there is no committed-update node, delete the plan and release the lock.
Otherwise, continue at "Second phase", above.
IMHO, as etcd is built upon HTTP, which is inherently an insecure protocol, it will be very hard to have a bulletproof solution.
Classical SQL databases use connection-oriented protocols, transactions and journaling so that users can be sure a transaction as a whole will be either fully committed or fully rolled back, even in the worst case of a power outage in the middle of the operation.
So if two operations depend on each other (a money transfer from one bank account to another), you can make sure that either both succeed or neither does, and you can simply implement in the database a journal of "operations" with their status, so that you can later check whether a particular one went through by consulting the journal, even if you were disconnected in the middle of the commit.
But I simply cannot imagine such a solution for etcd. So unless someone else finds a better way, you are left with two options
use a classical SQL database in the backend, using etcd (or equivalent) as a simple cache
accept the weaknesses of the protocol
BTW, I do not think that a timestamp in lieu of a long version number will strengthen the system, because under high load the probability that two client transactions use the same timestamp increases. Maybe you could try adding a unique id (a client id or just a technical UUID) to your fields, and when the version is n+1, just compare the UUID of the writer that incremented it: if it is yours, the transaction passed; if not, it did not.
But the really bad problem arises if, at the moment you read the version, it is not at n+1 but already at n+2. If the UUID is yours, you are sure your transaction passed, but if it is not, nobody can say.
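The cases above can be sketched with a tiny in-memory compare-and-swap store. `CasStore`, `new_writer_id` and `resolve_after_lost_response` are hypothetical names for illustration; nothing here uses the etcd API.

```python
import uuid


def new_writer_id():
    """A unique id identifying the writer (hypothetical helper)."""
    return str(uuid.uuid4())


class CasStore:
    """Single entity holding a value, a version and the id of the last writer."""

    def __init__(self, value, version=1):
        self.value = value
        self.version = version
        self.last_writer = None

    def compare_and_swap(self, expected_version, new_value, writer_id):
        if self.version != expected_version:
            return False
        self.value = new_value
        self.version = expected_version + 1
        self.last_writer = writer_id
        return True


def resolve_after_lost_response(store, expected_version, writer_id):
    """Classify an update whose response never arrived, by re-reading the entity."""
    if store.version == expected_version:
        return "not_applied"  # nothing changed, safe to retry
    if store.last_writer == writer_id:
        return "applied"  # our id is on the entity, so our write went through
    if store.version == expected_version + 1:
        return "lost_to_other_writer"  # exactly one step, taken by someone else
    return "unknown"  # several steps happened; ours may have been overwritten
```

The "unknown" branch is the n+2 case described above: the stored id only tells you who wrote last, not who wrote in between.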
I am currently developing an application for Azure Table Storage. In that application I have table which will have relatively few inserts (a couple of thousand/day) and the primary key of these entities will be used in another table, which will have billions of rows.
Therefore I am looking for a way to use an auto-incremented integer, instead of GUID, as primary key in the small table (since it will save lots of storage and scalability of the inserts is not really an issue).
There've been some discussions on the topic, e.g. on http://social.msdn.microsoft.com/Forums/en/windowsazure/thread/6b7d1ece-301b-44f1-85ab-eeb274349797.
However, since concurrency problems can be really hard to spot and debug, I am a bit uncomfortable with implementing this on my own. My question is therefore whether there is a well-tested implementation of this.
For everyone who will find it in search, there is a better solution. Minimal time for table lock is 15 seconds - that's awful. Do not use it if you want to create a truly scalable solution. Use Etag!
Create one entity in table for ID (you can even name it as ID or whatever).
1) Read it.
2) Increment.
3) InsertOrUpdate WITH ETag specified (from the read query).
If the last operation (InsertOrUpdate) succeeds, then you have a new, unique, auto-incremented ID. If it fails (exception with HttpStatusCode == 412), it means that some other client changed it. So, repeat steps 1, 2 and 3.
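The loop can be sketched as follows, with an in-memory object standing in for the ID entity. `EtagEntityStore`, `PreconditionFailed` and `next_id` are hypothetical names and not part of the Azure SDK; the exception plays the role of the HTTP 412 response.

```python
class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 response (ETag mismatch)."""


class EtagEntityStore:
    """Holds one counter entity whose ETag changes on every write."""

    def __init__(self):
        self._value = 0
        self._etag = 0

    def read(self):
        return self._value, self._etag

    def replace(self, value, if_match):
        """Write only if the caller's ETag still matches the stored one."""
        if if_match != self._etag:
            raise PreconditionFailed()
        self._value = value
        self._etag += 1


def next_id(store, max_attempts=10):
    """Read, increment, conditionally write; start over on an ETag mismatch."""
    for _ in range(max_attempts):
        value, etag = store.read()              # 1) read
        candidate = value + 1                   # 2) increment
        try:
            store.replace(candidate, if_match=etag)  # 3) write with ETag
            return candidate
        except PreconditionFailed:
            continue  # another client changed the entity; repeat 1, 2 and 3
    raise RuntimeError("could not allocate an id after repeated conflicts")
```

No lock or lease is held at any point, so a crashed client cannot block the others; the cost is an occasional extra round trip under contention.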
The usual time for Read+InsertOrUpdate is less than 200ms. My test utility with source on github.
See UniqueIdGenerator class by Josh Twist.
I haven't implemented this yet but am working on it ...
You could seed a queue with your next ids to use, then just pick them off the queue when you need them.
You need to keep a table to contain the value of the biggest number added to the queue. If you know you won't be using a ton of the integers, you could have a worker every so often wake up and make sure the queue still has integers in it. You could also have a used int queue the worker could check to keep an eye on usage.
You could also hook that worker up so that if the queue happened to be empty when your code needed an id, it could interrupt the worker's nap to create more keys ASAP.
If that call failed, you would need a way to tell the worker you are going to do its work for it (lock), then do the worker's work of getting the next id, and unlock:
lock
get the last key created from the table
increment and save
unlock
then use the new value.
The solution I found that prevents duplicate ids and lets you autoincrement it is to
lock (lease) a blob and let that act as a logical gate.
Then read the value.
Write the incremented value
Release the lease
Use the value in your app/table
Then if your worker role were to crash during that process, then you would only have a missing ID in your store. IMHO that is better than duplicates.
Here is a code sample and more information on this approach from Steve Marx
If you really need to avoid guids, have you considered using something based on date/time and then leveraging partition keys to minimize the concurrency risk.
Your partition key could be by user, year, month, day, hour, etc and the row key could be the rest of the datetime at a small enough timespan to control concurrency.
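As a rough illustration of that split, assuming a per-user, per-hour partition (the key layout and the `make_keys` helper are made up for this example):

```python
from datetime import datetime, timezone


def make_keys(user_id, when):
    """Partition by user and hour; the row key carries the rest of the timestamp."""
    partition_key = f"{user_id}_{when:%Y%m%d%H}"
    row_key = f"{when:%M%S%f}"  # minutes, seconds, microseconds
    return partition_key, row_key
```

Two inserts by the same user within the same microsecond would still collide, so the timespan has to be chosen to match the real write rate.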
Of course you have to ask yourself, at the price of data in Azure, whether avoiding a Guid is really worth all of this extra effort (assuming a Guid will just work).