Auto-increment on Azure Table Storage - concurrency

I am currently developing an application for Azure Table Storage. In that application I have a table which will have relatively few inserts (a couple of thousand per day), and the primary key of these entities will be used in another table, which will have billions of rows.
Therefore I am looking for a way to use an auto-incremented integer, instead of a GUID, as the primary key in the small table (since it will save lots of storage, and scalability of the inserts is not really an issue).
There've been some discussions on the topic, e.g. on http://social.msdn.microsoft.com/Forums/en/windowsazure/thread/6b7d1ece-301b-44f1-85ab-eeb274349797.
However, since concurrency problems can be really hard to debug and spot, I am a bit uncomfortable with implementing this on my own. My question is therefore whether there is a well-tested implementation of this?

For everyone who finds this in a search, there is a better solution. The minimum time for a table lock is 15 seconds - that's awful. Do not use it if you want to create a truly scalable solution. Use the ETag!
Create one entity in the table for the ID (you can even name it ID or whatever).
1) Read it.
2) Increment.
3) InsertOrUpdate WITH ETag specified (from the read query).
If the last operation (InsertOrUpdate) succeeds, you have a new, unique, auto-incremented ID. If it fails (an exception with HttpStatusCode == 412), it means some other client changed it in the meantime, so repeat steps 1, 2 and 3.
The usual time for a Read + InsertOrUpdate round trip is less than 200 ms. My test utility, with source, is on GitHub.
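A minimal sketch of those three steps, assuming the azure-data-tables Python SDK and a counter entity stored under PartitionKey "counters" / RowKey "id" (the names and the Value property are placeholders):

```python
from azure.core import MatchConditions
from azure.core.exceptions import ResourceModifiedError
from azure.data.tables import TableClient, UpdateMode

def next_id(table: TableClient, max_retries: int = 10) -> int:
    """Optimistically increment a single counter entity, retrying on ETag conflicts."""
    for _ in range(max_retries):
        counter = table.get_entity(partition_key="counters", row_key="id")   # 1) read
        counter["Value"] += 1                                                # 2) increment
        try:
            table.update_entity(                                             # 3) conditional write
                counter,
                mode=UpdateMode.REPLACE,
                etag=counter.metadata["etag"],
                match_condition=MatchConditions.IfNotModified,
            )
            return counter["Value"]
        except ResourceModifiedError:
            continue  # HTTP 412: another client changed it first; re-read and retry
    raise RuntimeError("could not allocate an ID after repeated conflicts")
```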

See UniqueIdGenerator class by Josh Twist.

I haven't implemented this yet but am working on it ...
You could seed a queue with your next ids to use, then just pick them off the queue when you need them.
You need to keep a table containing the value of the biggest number added to the queue. If you know you won't be using a ton of the integers, you could have a worker wake up every so often and make sure the queue still has integers in it. You could also have a used-int queue the worker could check to keep an eye on usage.
You could also hook that worker up so that, if the queue was empty when your code needed an id (by chance), it could interrupt the worker's nap to create more keys asap.
If that call failed, you would need a way to tell the worker that you are going to do its work for it (lock), then do the worker's work of getting the next id, and unlock:
lock
get the last key created from the table
increment and save
unlock
then use the new value.
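A rough sketch of the consuming side, assuming Azure Queue Storage via the azure-storage-queue Python SDK; refill_and_take is a hypothetical fallback that performs the lock / read / increment / unlock steps above:

```python
from azure.storage.queue import QueueClient

def take_id(queue: QueueClient) -> int:
    """Pop a pre-seeded ID off the queue; if the queue is empty, do the
    worker's job ourselves via the lock / read / increment / unlock steps."""
    for msg in queue.receive_messages():
        queue.delete_message(msg)      # consume it so no other client can take it
        return int(msg.content)
    return refill_and_take()           # hypothetical fallback implementing the steps above
```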

The solution I found that prevents duplicate IDs and lets you auto-increment is to
lock (lease) a blob and let that act as a logical gate.
Then read the value.
Write the incremented value.
Release the lease.
Use the value in your app/table.
If your worker role were to crash during that process, you would only have a missing ID in your store. IMHO that is better than duplicates.
Here is a code sample and more information on this approach from Steve Marx
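In case that link isn't to hand, here is an independent rough sketch of the same lease-as-gate pattern, assuming the azure-storage-blob Python SDK and a counter blob that already exists and holds an integer as text:

```python
from azure.storage.blob import BlobClient

def next_id(counter_blob: BlobClient) -> int:
    """Serialize access to a counter blob by taking a lease (the 'logical gate')."""
    lease = counter_blob.acquire_lease(lease_duration=15)   # 15 s is the minimum lease
    try:
        current = int(counter_blob.download_blob().readall())          # read the value
        counter_blob.upload_blob(str(current + 1), overwrite=True,     # write it back
                                 lease=lease)
        return current + 1
    finally:
        lease.release()                                     # let the next caller through
```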

If you really need to avoid guids, have you considered using something based on date/time and then leveraging partition keys to minimize the concurrency risk.
Your partition key could be by user, year, month, day, hour, etc and the row key could be the rest of the datetime at a small enough timespan to control concurrency.
Of course you have to ask yourself, at the price of data in Azure, whether avoiding a GUID is really worth all of this extra effort (assuming a GUID would just work).
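For illustration, a hypothetical key scheme along those lines (partitioning per user per hour, with the sub-hour remainder as the row key):

```python
from datetime import datetime, timezone

def make_keys(user_id: str, now: datetime = None) -> tuple:
    """Partition by user + hour; the row key is the remaining sub-hour timestamp."""
    now = now or datetime.now(timezone.utc)
    partition_key = f"{user_id}-{now:%Y%m%d%H}"        # e.g. "alice-2024060113"
    row_key = f"{now:%M%S}{now.microsecond:06d}"       # minutes + seconds + microseconds
    return partition_key, row_key
```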

Related

What's the cheapest way to store an auto increment indexed list of values in AWS?

I have a DynamoDB-based web application that uses DynamoDB to store my large JSON objects and perform simple CRUD operations on them via a web API. I would like to add a new table that acts like a categorization of these values. The user should be able to select from a selection box which category the object belongs to. If a desirable category does not exist, the user should be able to create a new category specifying a name which will be available to other objects in the future.
It is critical to the application that every one of these categories be given an integer ID that increments, starting the first at 1. These auto-generated numbers will turn into reproducible serial numbers for back-end reports that will not use the user-visible text name.
So I would like to have a simple API available from the web fronted that allows me to:
A) GET /category : produces { int : string, ... } of all categories mapped to an ID
B) PUSH /category : accepts string and stores the string to the next integer
Here are some ideas for how to handle this kind of project.
Store it in DynamoDB with integer indexes. This has some benefits, but it leaves a lot to be desired. Firstly, there's no auto-incrementing ID in DynamoDB, but I could definitely get the state of the table, create a new ID, and store the result. This might have issues with consistency and race conditions, but there's probably a way to achieve it safely. It might, however, be a big anti-pattern to use DynamoDB this way.
Store it in DynamoDB as one object in a table with some random index. Just store the mapping as a JSON object. This really forgets the notion of tables in DynamoDB and uses it as a simple file. It might also run into some issues with race conditions.
Use AWS ElastiCache to have a Redis key-value store. This might be "the right" decision, but the downside is that ElastiCache is an always-on DB offering where you pay per hour. For a low-traffic web site like mine I'd be paying a minimum of $12/mo, I think, and I would really like for this to be pay per access/update due to the low volume. I'm not sure there's an auto-increment feature for Redis built in the way I'd need it, but it's pretty trivial to make a transaction that gets the length of the table, adds one, and stores a new value. Race conditions are easily avoided with this solution.
Use a SQL database like AWS Aurora or MySQL. This has the same upsides as Redis, but it's even more overkill than Redis is, it costs a lot more, and it's still always on.
Run my own in memory web service or MongoDB etc... still you're paying for constant containers running. Writing my own thing is obviously silly but I'm sure there are services that match this issue perfectly but they'd all require a constant container to run.
Is there a good way to just store a simple list, or an integer mapping like this, that doesn't incur a constant monthly cost? Is there a better way to do this with DynamoDB?
Store the maxCounterValue as an item in DynamoDB.
For the PUSH /category, perform the following:
Get the current maxCounterValue.
TransactWrite:
Put the category name and id into a new item with id = maxCounterValue + 1.
Update the maxCounterValue +1, add a ConditionExpression to check that maxCounterValue = :valueFromGetOperation.
If the TransactWrite fails, start again at step 1 and try up to X more times.
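A rough sketch of that flow with boto3, assuming a single table named categories keyed on a string pk (all names here are placeholders, not from the question):

```python
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client("dynamodb")

def push_category(name: str, max_attempts: int = 5) -> int:
    """Allocate the next integer category ID using a counter item and a transaction."""
    for _ in range(max_attempts):
        counter = dynamodb.get_item(
            TableName="categories",
            Key={"pk": {"S": "maxCounterValue"}},
        ).get("Item", {"value": {"N": "0"}})
        current = int(counter["value"]["N"])
        try:
            dynamodb.transact_write_items(TransactItems=[
                {"Put": {  # the new category item, with id = maxCounterValue + 1
                    "TableName": "categories",
                    "Item": {"pk": {"S": f"category#{current + 1}"},
                             "name": {"S": name},
                             "id": {"N": str(current + 1)}},
                }},
                {"Update": {  # bump the counter only if nobody else did in the meantime
                    "TableName": "categories",
                    "Key": {"pk": {"S": "maxCounterValue"}},
                    "UpdateExpression": "SET #v = :new",
                    "ConditionExpression": "#v = :old OR attribute_not_exists(#v)",
                    "ExpressionAttributeNames": {"#v": "value"},
                    "ExpressionAttributeValues": {":new": {"N": str(current + 1)},
                                                  ":old": {"N": str(current)}},
                }},
            ])
            return current + 1
        except ClientError as err:
            if err.response["Error"]["Code"] != "TransactionCanceledException":
                raise
            continue  # someone else bumped the counter; re-read and retry
    raise RuntimeError("could not allocate a category ID")
```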

Better method for querying DynamoDB table randomly?

I've included links to other answers along with our approaches; they seem to be the most optimal on the web right now.
Our records need to be categorized (eg. "horror", "thriller", "tv"), and randomly accessible both in specific categories and across all/some categories. We generally need to access about 20 - 100 items at a time. We also have a smallish number of categories (less than 100).
We write to the database for uploading/removing content, although this is done in batches and does not need to be real time.
We have tried two different approaches, with two different data structures.
Approach 1
AWS DynamoDB - Pick a record/item randomly?
Help selecting nth record in query.
In short, using the category as a hash key, and a UUID as the sort key. Generate a random UUID, query Dynamo using greater than or less than, and limit to 1. This is even suggested by an AWS employee in the second link. (We've also tried increasing the limit to the number of items we need, but this increases the probability of the query failing the first time around).
Issues with this approach:
The first query can fail if the random UUID is greater than (or less than) all of the stored UUIDs
Querying on any specific category will cause throttling at scale (small number of partitions)
We've also considered adding a suffix to each category to artificially increase the number of partitions we have, as pointed out in the following link.
AWS Database Blog
Choosing the Right DynamoDB Partition Key
Approach 2
Amazon Web Services: How do we get random item from the dynamoDb's table?
Doing something similar to this, where we concatenate the category with a sequential number, and use this as the hash key. e.g. horror-000001.
By knowing the number of records in each category, we're able to perform random queries across our entire data set, while also avoiding hot partitions/keys.
Issues with this approach
We need a secondary data structure to manage the sequential counts across each category
Writing (especially deleting) is significantly more complex, although this doesn't need to happen in real time.
Conclusion
Both approaches solve our main use case of random queries on category/categories, but the cons they offer are really deterring us from using them. We're leaning more towards approach #1 using suffixes to solve the hot partitioning issue, although we would need the additional retry logic for failed queries.
Is there a better way of approaching this problem? Specifically looking for solutions capable of scaling well (No scan), without requiring extra resources be implemented. #1 fits the bill, but needing to manage suffixes and failed attempts really deters us from using it, especially when it is being called inside a lambda (billed for time used).
Thanks!
Follow Up
After more research and testing, my team has decided to move towards MySQL hosted on RDS for these tables. We learned that this is one of the few use cases where DynamoDB does not fit, and it requires rewriting your use case to fit the DB (bad).
We felt that the extra complexity required to integrate random sampling on DynamoDB wasn't worth it, and we were unable to come up with any comparable solutions. We are, however, sticking with DynamoDB for our tables that do not need random accessibility due to the price and response times.
For anyone wondering why we chose MySQL, it was largely due to the Node.js library available, great online resources (which DynamoDB definitely lacks), easy integration via RDS with our Lambdas, and the option to migrate to Amazon's Aurora database.
We also looked at PostgreSQL, but we weren't as happy with the client library or admin tools, and we believe that MySQL will suit our needs for these tables.
If anybody has anything else they'd like to add or a specific question please leave a comment or send me a message!
This was too long for a comment, and I guess it's pretty much a full-fledged answer now.
Approach 2
I've found that my typical time to get a single item from dynamodb to a host in the same region is <10ms. As long as you're okay with at most 1-2 extra calls, you can quite easily implement approach 2.
If you use a keys only GSI where the category is your hash key and the primary key of the table is your range key, you can quickly find the largest numbered single item within a category.
When you add a new item, find the largest number for that category from the GSI and then write the new item to the table with sequence number n+1.
When you delete, find the item with the largest sequence number for that category from the GSI, overwrite the item you are deleting, and then delete the now duplicated item from its position at the highest sequence number.
To randomly get an item, query the GSI to find the highest numbered item in the category, and then randomly pick a number since you now know the valid range.
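A hedged sketch of that lookup with boto3, assuming a table whose partition key pk looks like "horror-000001" and a keys-only GSI named category-index with category as its hash key and pk as its range key (all of these names are assumptions):

```python
import random
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("content")   # table/index names are assumptions

def highest_sequence(category: str) -> int:
    """Find the largest sequence number in a category via the keys-only GSI."""
    resp = table.query(
        IndexName="category-index",
        KeyConditionExpression=Key("category").eq(category),
        ScanIndexForward=False,   # descending, so the first item has the highest key
        Limit=1,
    )
    items = resp["Items"]
    return int(items[0]["pk"].split("-")[-1]) if items else 0

def random_item(category: str):
    """Pick a random item by choosing a sequence number in the known valid range."""
    n = highest_sequence(category)
    if n == 0:
        return None
    pick = random.randint(1, n)
    return table.get_item(Key={"pk": f"{category}-{pick:06d}"}).get("Item")
```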
Approach 1
I'm not sure exactly what you mean when you say "without requiring extra resources to be implemented". If you're okay with using a managed resource (no dev work to implement), you can also make Approach 1 work by putting a DAX cluster in front of your dynamodb table. Then you can query to your heart's content without really worrying about hot partitions. (Though the caching layer means that new/deleted items won't be reflected right away.)

Create fault tolerance example with DynamoDB Streams

I have been looking at DynamoDB to create something close to a transaction. I was watching this video presentation: https://www.youtube.com/watch?v=KmHGrONoif4, in which the speaker shows, around the 30 minute mark, ways to make DynamoDB operations as close to ACID-compliant as can be. He shows that the best concept is to use DynamoDB Streams, but doesn't show a demo or an example. I have a very simple scenario I am looking at: I have one table called USERS. Each user has a list of friends. If two users no longer wish to be friends, they must be removed from both of the users' entities (I can't afford for one friend to be deleted from one entity and, due to a crash for example, the second user entity's friend attribute not being updated, causing inconsistent data). I was wondering if someone could provide a simple walk-through of how to accomplish something like this to see how it all works? If code could be provided, that would be great.
Cheers!
Here is the transaction library that he is referring to: https://github.com/awslabs/dynamodb-transactions
You can read through the design: https://github.com/awslabs/dynamodb-transactions/blob/master/DESIGN.md
Here is the Kinesis client library:
http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html
When you're writing to DynamoDB, you can get an output stream with all the operations that happen on the table. That stream can be consumed and processed by the Kinesis Client Library.
In your case, have your client remove it from the first user, then from the second user. In the Kinesis Client Library, when you are consuming the stream and see a user removed, look at who he was friends with and go check/remove if needed - if needed, the removal should probably be done through the same means. It's not truly a transaction, and it relies on the fact that the KCL guarantees that the records from the streams will be processed.
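The answer describes consuming the stream with the KCL; as a simpler illustration of the same reconciliation idea, here is a hedged sketch written as an AWS Lambda handler for a DynamoDB stream (the USERS table name, the friends string-set attribute, and the NEW_AND_OLD_IMAGES stream view are all assumptions):

```python
import boto3

table = boto3.resource("dynamodb").Table("USERS")

def handler(event, context):
    """When a friend disappears from one user's list, remove the reciprocal entry too."""
    for record in event["Records"]:
        if record["eventName"] != "MODIFY":
            continue
        user_id = record["dynamodb"]["Keys"]["userId"]["S"]
        old = record["dynamodb"].get("OldImage", {}).get("friends", {}).get("SS", [])
        new = record["dynamodb"].get("NewImage", {}).get("friends", {}).get("SS", [])
        for removed in set(old) - set(new):
            # Idempotent: deleting a non-member from a string set is a no-op,
            # so reprocessing the same stream record is harmless.
            table.update_item(
                Key={"userId": removed},
                UpdateExpression="DELETE friends :u",
                ExpressionAttributeValues={":u": {user_id}},
            )
```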
To add to the confusion, the KCL itself uses DynamoDB to store where it is in the stream while processing and to checkpoint processed records.
You should try to minimize the need for transactions, which are a nice concept at a small scale but can't really scale once you become very successful and need to support millions and billions of records.
If you are thinking in a NoSQL mindset, you can consider using a slightly different data model. One simple example is to use a Global Secondary Index on a single table on the "friend-with" attribute. When you add a single record with a pair of friends, both the record and the index will be updated in a single action, and the same holds when you delete the friendship record.
If you choose to use the update-stream mechanism or the Global Secondary Index, you should take into consideration the "eventual consistency" of a distributed system. Consistency can be achieved within milliseconds, but it can also take longer. You should analyze the business implications and the technical measures you can take to address it. For example, you can verify the existence of both records (the main table as well as the index, if you found it in the index) before you present it to the user.

Optimistic locking and re-try

I'm not sure about proper design of an approach.
We use optimistic locking, with a long incremented version placed on every entity. Each update of such an entity is executed via a compare-and-swap algorithm which simply succeeds or fails depending on whether some other client updated the entity in the meantime - classic optimistic locking, as e.g. Hibernate does.
We also need to adopt a retry approach. We use an HTTP-based storage (etcd), and it can happen that an update request simply times out.
And here is the problem: how to combine optimistic locking with retries. Here is the specific issue I'm facing.
Let's say I have an entity with version=1 and I'm trying to update it. The next version is obviously 2. My client then executes a conditional update. It is successfully executed only when the version in persistence is 1, and it is atomically updated to version=2. So far, so good.
Now, let's say that the response for the update request does not arrive. It's impossible to say at this moment whether it succeeded or not. The only thing I can do is retry the update. The in-memory entity still contains version=1, intending to update the value to 2.
The real problem arise now. What if the second update fails because a version in persistence is 2 and not 1?
There are two possible reasons:
the first request indeed caused the update - the operation was successful but the response got lost, or my client timed out, whatever. It just did not arrive, but the update went through
some other client performed the update concurrently in the background
Now I can't say which is true. Did my client update the entity, or did some other client? Did the operation pass or fail?
The current approach we use just compares the persisted entity and the entity in main memory, either via Java equals or JSON content equality. If they are equal, the update method is declared successful. I'm not satisfied with this algorithm, as it is neither cheap nor particularly sound.
Another possible approach is to not use a long version but a timestamp instead. Every client generates its own timestamp within the update operation, in the sense that a potential concurrent client would, with high probability, generate a different one. The problem for me is the probability, especially when two concurrent updates come from the same machine.
Is there any other solution?
You can fake transactions in etcd by using a two-step protocol.
Algorithm for updating:
First phase: record the update to etcd
add an "update-lock" node with a fairly small TTL. If it exists, wait until it disappears and try again.
add a watchdog to your code. You MUST abort if performing the next steps takes longer than the lock's TTL (or if you fail to refresh it).
add a "update-plan" node with [old,new] values. Its structure is up to you, but you need to ensure that the old values are copied while you hold the lock.
add a "committed-update" node. At this point you have "atomically" updated the data.
Second phase: perform the actual update
read the "planned-update" node and apply the changes it describes.
If a change fails, verify that the new value is present.
If it's not, you have a major problem. Bail out.
delete the committed-update node
delete the update-plan node
delete the update-lock node
If you want to read consistent data:
While there is no committed-update node, your data are OK.
Otherwise, wait for it to get deleted.
Whenever committed-update is present but update-lock is not, initiate recovery.
Transaction recovery, if you find an update-plan node without a lock:
Get the update-lock.
if there is no committed-update node, delete the plan and release the lock.
Otherwise, continue at "Second phase", above.
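A rough sketch of the first phase against the etcd v2 HTTP API using plain requests (the endpoint, key names and TTL are assumptions, and error handling is minimal):

```python
import json
import time
import requests

ETCD = "http://127.0.0.1:2379/v2/keys"   # etcd v2 HTTP API; endpoint is an assumption
LOCK_TTL = 10                            # seconds; the watchdog must abort before this expires

def record_update(old, new):
    """Phase one: take the short-lived lock node, write the plan, mark it committed."""
    while True:
        r = requests.put(f"{ETCD}/counter/update-lock",
                         data={"value": "me", "ttl": LOCK_TTL, "prevExist": "false"})
        if r.status_code in (200, 201):
            break                        # we created the lock node
        time.sleep(0.1)                  # someone else holds it; wait and try again
    deadline = time.time() + LOCK_TTL
    requests.put(f"{ETCD}/counter/update-plan",
                 data={"value": json.dumps({"old": old, "new": new})})
    if time.time() > deadline:
        raise RuntimeError("watchdog: lock may have expired, aborting")
    requests.put(f"{ETCD}/counter/committed-update", data={"value": "1"})
```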
IMHO, as etcd is built upon HTTP, which is an inherently insecure protocol, it will be very hard to have a bulletproof solution.
Classical SQL databases use connected protocols, transactions and journaling to let users make sure that a transaction as a whole is either fully committed or fully rolled back, even in the worst case of a power outage in the middle of the operation.
So if two operations depend on each other (a money transfer from one bank account to the other), you can make sure that either both are ok or neither is, and you can simply implement in the database a journal of "operations" with their status, to be able to see later whether a particular one went through by consulting the journal, even if you were disconnected in the middle of the commit.
But I simply cannot imagine such a solution for etcd. So unless someone else finds a better way, you are left with two options
use a classical SQL database in the backend, using etcd (or equivalent) as a simple cache
accept the weaknesses of the protocol
BTW, I do not think that a timestamp in lieu of a long version number will strengthen the system, because under high load the probability that two client transactions use the same timestamp increases. Maybe you could try to add a unique id (a client id or just a technical UUID) to your fields, and when the version is n+1, just compare the UUID that incremented it: if it is yours, the transaction passed; if not, it did not.
But the really bad problem arises if, at the moment you can read the version, it is not at n+1 but already at n+2. If the UUID is yours, you are sure your transaction passed, but if it is not, nobody can say.
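To illustrate the UUID idea, a hedged sketch against a hypothetical store exposing read and compare_and_swap (none of this maps to a specific etcd client API):

```python
import uuid

CLIENT_ID = uuid.uuid4().hex  # identifies this client instance

def update_with_marker(store, key, new_value, expected_version):
    """Tag each write with the writer's ID so an ambiguous retry can tell
    'my earlier write landed' apart from 'someone else wrote version n+1'."""
    payload = {"value": new_value, "version": expected_version + 1, "writer": CLIENT_ID}
    if store.compare_and_swap(key, payload, expected_version):   # hypothetical CAS helper
        return True
    current = store.read(key)
    if current["version"] == expected_version + 1:
        return current["writer"] == CLIENT_ID   # ours iff we wrote it
    return None  # version moved further ahead; cannot tell (as the answer notes)
```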

Web Service to return unique auto incremented human readable id number

I'm looking to create a simple web service that, when polled, returns a unique id. The ID has to be human readable (i.e. not a GUID, probably in the form 000023) and is simply incremented by 1 each time it's called.
Now I need to consider that it may be called by two different applications at the same time and I don't want it to return the same number to each application.
Is there another option than using a database to store the current number?
Surely this has been done before; can anyone point me at some source code if it has?
Thanks,
Neil
Use a critical section to ensure that only one caller at a time passes through the piece of code that hands out numbers. You can do this using the lock statement, or by being slightly more hardcore and using a mutex directly. Doing this will ensure that you return a different number to each caller.
As for storing it, using a database is overkill for returning an auto-incrementing number - although SQL Server and Oracle (and most likely others, but I can't speak for them) both provide an auto-incrementing key feature. So you could have the web service called, generate a new entry in the database table, return the key, and the caller can use that number as a key back to that record (if you are saving more data later, after the initial call). This way you also let the database worry about generating unique numbers and you don't have to worry about the details - although this is not a good option if you don't already have a database.
The other option is to store it in a local file, although it would be expensive to read the file, increment the number, and write it back out, all within a critical section.
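The lock statement above is .NET-specific; as a language-agnostic illustration of the critical-section idea, here is a minimal sketch in Python using threading.Lock (an in-process lock only - it does not survive restarts or protect multiple server instances):

```python
import threading

class IdService:
    """Hands out incrementing, human-readable IDs; only one caller at a time
    may enter the critical section that bumps the counter."""

    def __init__(self, start: int = 0):
        self._lock = threading.Lock()
        self._current = start

    def next_id(self) -> str:
        with self._lock:                      # the critical section
            self._current += 1
            return f"{self._current:06d}"     # e.g. "000023"
```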
You can use a file.
Pseudocode:
lock('counter.txt')             // waits until the lock is free
counter = read('counter.txt')   // read only after the lock is held, otherwise
                                // two callers can read the same value
counter++
write('counter.txt', counter)
unlock('counter.txt')
print counter
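A runnable version of the same idea, as a rough sketch in Python using POSIX advisory file locks (the file name is just an example; on Windows you would need a different locking primitive):

```python
import fcntl  # POSIX-only advisory file locking
import os

def next_counter(path: str = "counter.txt") -> int:
    """Lock the counter file, read the current value, write back value + 1."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)   # create the file if it is missing
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)           # blocks until the lock is free
        raw = os.read(fd, 64).strip()
        counter = (int(raw) if raw else 0) + 1
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, str(counter).encode())
        return counter
    finally:
        os.close(fd)                             # closing the fd releases the lock
```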