Create fault tolerance example with DynamoDB Streams - amazon-web-services

I have been looking at DynamoDB to create something close to a transaction. I was watching this video presentation: https://www.youtube.com/watch?v=KmHGrONoif4 in which the speaker shows, around the 30-minute mark, ways to make DynamoDB operations as close to ACID compliant as they can be. He presents DynamoDB Streams as the best approach, but doesn't show a demo or an example. I have a very simple scenario I am looking at: I have one table called USERS. Each user has a list of friends. If two users no longer wish to be friends, each must be removed from the other user's entity (I can't afford for a friend to be deleted from one entity and then, due to a crash for example, the second user's friends attribute not being updated, leaving inconsistent data). I was wondering if someone could provide a simple walk-through of how to accomplish something like this, to see how it all works? If code could be provided, that would be great.
Cheers!

Here is the transaction library that he is referring to: https://github.com/awslabs/dynamodb-transactions
You can read through the design: https://github.com/awslabs/dynamodb-transactions/blob/master/DESIGN.md
Here is the Kinesis client library:
http://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html
When you're writing to DynamoDB, you can get an output stream with all the operations that happen on the table. That stream can be consumed and processed by the Kinesis Client Library.
In your case, have your client remove the friend from the first user, then from the second user. When the Kinesis Client Library consumer sees in the stream that a friend was removed from a user, it should look up who that friend was and check whether the reverse removal happened; if it did not, the consumer performs it, probably through the same means the client uses. It's not truly a transaction, and it relies on the fact that KCL guarantees that the records from the stream will be processed.
To add to the confusion, KCL itself uses DynamoDB to store its position in the stream and to checkpoint processed records.
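The check-and-repair idea described above can be sketched without any AWS dependency. This is a minimal in-memory illustration, not KCL or DynamoDB code: the `users` dict, the `stream` queue, and the `FRIEND_REMOVED` record shape are all hypothetical stand-ins for the USERS table and its change stream.

```python
from collections import deque

# In-memory stand-ins for the USERS table and its change stream (hypothetical).
users = {"alice": {"bob", "carol"}, "bob": {"alice"}, "carol": {"alice"}}
stream = deque()


def remove_friend(user, friend):
    """Remove `friend` from `user` and emit a change record, like a table write."""
    if friend in users[user]:
        users[user].discard(friend)
        stream.append({"event": "FRIEND_REMOVED", "user": user, "friend": friend})


def process_stream():
    """Consumer: for every removal, make sure the reverse removal also happened."""
    while stream:
        record = stream.popleft()
        if record["event"] == "FRIEND_REMOVED":
            # Idempotent: does nothing if the other side is already cleaned up.
            remove_friend(record["friend"], record["user"])


# The client crashes after the first of its two writes ...
remove_friend("alice", "bob")
# ... and the stream consumer repairs the other side.
process_stream()
```

Because the repair is idempotent, it does not matter whether the client managed the second write itself; replaying the same record twice leaves the data unchanged.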

You should try to minimize the need for transactions. They are a nice concept at small scale, but they don't really scale once you become very successful and need to support millions or billions of records.
If you are thinking in a NoSQL mindset, you can consider using a slightly different data model. One simple example is to use a Global Secondary Index on the "friend-with" attribute of a single table. When you add a single record with a pair of friends, both the record and the index are updated by a single action. Likewise, when you delete the friendship record, both the table and the index are updated by a single action.
Whether you choose the update stream mechanism or the Global Secondary Index, you should take into account the "eventual consistency" of the distributed system. Consistency can be reached within milliseconds, but it can also take longer. You should analyze the business implications and the technical measures you can take to handle it. For example, you can verify the existence of both records (in the main table as well as the index, if you found it in the index) before you present it to the user.
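A rough sketch of the one-record-per-friendship model, assuming a Python set can stand in for both the table and its index. The function names are hypothetical, and the sketch deliberately ignores the propagation delay between table and index mentioned above.

```python
# Each element is one friendship record: (user, friend_with).
friendships = set()


def add_friendship(a, b):
    """One write creates the friendship for both users."""
    friendships.add(tuple(sorted((a, b))))


def remove_friendship(a, b):
    """One delete removes the friendship for both users."""
    friendships.discard(tuple(sorted((a, b))))


def friends_of(user):
    by_key = {f for u, f in friendships if u == user}      # lookup by table key
    by_index = {u for u, f in friendships if f == user}    # lookup via the "friend-with" index
    return by_key | by_index


add_friendship("alice", "bob")
add_friendship("carol", "alice")
remove_friendship("bob", "alice")
```

Since a friendship exists in exactly one record, there is no second write that a crash could skip.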

Related

Are DynamoDB conditional writes transactional?

I'm having trouble wrapping my head around the dichotomy of DDB providing conditional writes but also being eventually consistent. These two truths seem to be at odds with each other.
In the classic scenario, user Bob updates key A and sets the value to "FOO". User Alice reads from a node that hasn't received the update yet, and so it gets the original value "BAR" for the same key.
If Bob and Alice write to different nodes on the cluster without condition checks, it's possible to have a conflict where Alice and Bob wrote to the same key concurrently and DDB does not know which update should be the latest. This conflict has to be resolved by the client on next read.
But what about when conditional writes are used?
User Bob sends their update for A as "FOO" if the existing value for A is "BAR".
User Alice sends their update for A as "BAZ" if the existing value for A is "BAR".
Locally each node can check to see if their node has the original "BAR" value and go through with the update. But the only way to know the true state of A across the cluster is to make a strongly consistent read across the cluster first. This strongly consistent read must be blocking for either Alice or Bob, or they could both make a strongly consistent read at the same time.
So here is where I'm getting confused about the nature of DDB's conditional writes. It seems to me that either:
1) Conditional writes are only evaluated locally. Merge conflicts can still occur.
2) Conditional writes are evaluated cross-cluster.
If it is #2, the only way I see that working is if:
1) A lock is created for the key.
2) A strongly consistent read is made.
Let's say it's #2. Now where does this leave Bob's update? The update was made to node 2 and sent to node 1 and we have a majority quorum. But to make those updates available to Alice when they do their own conditional write, those updates need to be flushed from WAL. So in a conditional write are the updates always flushed? Are writes always flushed in general?
There have been other questions like this here on SO, but the answers were a repeat of, or a link to, the AWS documentation about this. The AWS documentation doesn't really explain this (or I missed it).
DynamoDB conditional writes are "transactional" writes but how they're done is not public information & is perhaps proprietary intellectual property.
DynamoDB developers are the only ones with this information.
Your issue is that you're looking at this from a node perspective. I have gone through every mention of "node" in the DynamoDB documentation, and they are all mentions of Node.js or DAX nodes, not database nodes.
While there can be outdated reads (which would indicate some form of node), there are no database nodes as such when doing conditional writes.
User Bob sends their update for A as "FOO" if the existing value for A is "BAR". User Alice sends their update for A as "BAZ" if the existing value for A is "BAR".
Whoever's request gets there first is the one that goes through first.
The next request will just fail, meaning you now need to make a new read request to obtain the latest value to then proceed with the 2nd later write.
The Amazon DynamoDB developer guide shows this very clearly.
Note that there are no nodes, replicas, etc.; there is only one reference to the DynamoDB table.
Condition writes are probably evaluated cross-cluster & a strongly consistent read is probably made but Amazon has not made this information public.
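The "first request wins, the second fails and must re-read" behaviour described in this answer can be sketched as a compare-and-set on a toy store. The `Table` class and `put_if` method are hypothetical and say nothing about how DynamoDB implements this internally; they only show what the caller observes.

```python
import threading


class ConditionFailed(Exception):
    """Raised when the expected value does not match the stored value."""


class Table:
    """Toy store that serializes writes; not DynamoDB's real implementation."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._items.get(key)

    def put_if(self, key, expected, new_value):
        # The check and the write happen atomically under one lock.
        with self._lock:
            if self._items.get(key) != expected:
                raise ConditionFailed(key)
            self._items[key] = new_value


table = Table()
table.put_if("A", None, "BAR")

table.put_if("A", "BAR", "FOO")          # Bob arrives first and succeeds
try:
    table.put_if("A", "BAR", "BAZ")      # Alice's condition no longer holds
    alice_ok = True
except ConditionFailed:
    alice_ok = False

# Alice re-reads the latest value and retries against it.
current = table.get("A")
table.put_if("A", current, "BAZ")
```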
Ermiya Eskandary is correct that the exact details of DynamoDB's implementation aren't public knowledge, and are also subject to change in the future while preserving the documented guarantees of the API. Nevertheless, various documents, and especially video presentations that Amazon developers gave in the past, have made it relatively clear how this works under the hood, at least in broad strokes. I'll try to explain my understanding here:
Note that this might not be 100% accurate, and I don't have any inside knowledge about DynamoDB.
DynamoDB keeps three copies of each item.
One of the nodes holding a copy of a specific item is designated the leader for this item (there isn't a single "leader" - it can be a different leader per item). As far as I know, we have no details on which protocol is used to choose this leader (of course, if nodes go down, the leader choice changes).
A write to an item is started on the leader, which serializes writes to the same item. Note how DynamoDB conditional updates can only read and update the same item, so the same node (the leader) can read and write the item with just a local lock. After the leader evaluates the condition and decides to write, it also sends an update to the two other nodes, returning success to the user only after two of the three nodes have successfully written the data (to ensure durability).
As you probably know, DynamoDB reads have two options, consistent and eventually consistent. An eventually consistent read reads from one of the three replicas at random, and might not yet see the result of a successful write (if the write wrote two copies, but not yet the third one). A consistent read reads from the leader, so it is guaranteed to read the previously written data.
Finally you asked about DynamoDB's newer and more expensive "Transaction" support. This is a completely different feature. DynamoDB "Transactions" are all about reads and writes to multiple items in the same request. As I explained above, the older conditional-updates feature only allows a read-modify-write operation to involve a single item at a time, and therefore has a simpler implementation - where a single node (the leader) can serialize concurrent writes and make the decisions without a complex distributed algorithm (however, a complex distributed algorithm is needed to pick the leader).
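The leader-and-replicas picture in this answer can be made concrete with a toy model. It follows the answer's stated understanding only (three copies, the leader at index 0, success after two copies are written); the `ReplicatedItem` class and its methods are hypothetical, and none of this is confirmed DynamoDB behaviour.

```python
class ReplicatedItem:
    """Toy model of one item with three copies; index 0 is the leader."""

    def __init__(self, value):
        self.replicas = [value, value, value]

    def write(self, value, lagging=2):
        # The leader and one follower acknowledge; the `lagging` follower
        # (index 1 or 2) has not applied the write yet.
        for i in range(3):
            if i != lagging:
                self.replicas[i] = value

    def catch_up(self):
        # Background replication eventually brings every copy up to date.
        self.replicas = [self.replicas[0]] * 3

    def read(self, consistent=False, replica=0):
        # A consistent read goes to the leader; otherwise any replica may answer.
        return self.replicas[0] if consistent else self.replicas[replica]


item = ReplicatedItem("BAR")
item.write("FOO")

stale = item.read(replica=2)          # eventually consistent read hits the lagging copy
fresh = item.read(consistent=True)    # consistent read goes to the leader
item.catch_up()
later = item.read(replica=2)          # the lagging copy has caught up
```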

I do not see any records in my dynamodb stream

I have problems implementing dynamodbstreams. We want to get records of changes right at the time the dynamodb table is changed.
We've used the Java example from https://docs.aws.amazon.com/en_en/amazondynamodb/latest/developerguide/Streams.LowLevel.Walkthrough.html and translated it for our C++ project. Instead of ShardIteratorType.TRIM_HORIZON we use ShardIteratorType.LATEST. Also, I am currently testing with an existing table and do not know how many records to expect.
Most of the time when iterating over the shards I retrieve from Aws::DynamoDBStreams::DynamoDBStreamsClient and the Aws::DynamoDBStreams::Model::DescribeStreamRequest I do not see any records. For testing I change entries in the dynamodb table through the aws console. But sometimes (and I do not know why) there are records and it works as expected.
I am sure that I misunderstand the concept of streams and especially of shards and records. My thinking is that I need to find a way to find the most recent shard and to find the most recent data in that shard.
Isn't this what ShardIteratorType.LATEST would do? How can I find the most recent data in my stream?
I appreciate all of your thoughts and am curious about what happens to my first stackoverflow post ever.
Best
David
How can I find the most recent data in my stream?
How would you define the most recent data? Last 10 entries? Last entry? Or data that is not yet in the shard? The question may sound silly but the answer makes a difference.
The option - LATEST - that you are using is going to set the head of the iterator right after the last entry which means that unless new data arrives after the iterator has been created, there will be nothing to read.
If by the most recent data you mean some records that are already in the shard then you can't use LATEST. The easy option is to use TRIM_HORIZON.
Or, even easier, subscribe a Lambda function to that stream; it will be invoked automatically whenever a new record is put into the stream (with the record passed to the function as payload), which might be preferable if you need to handle events in near-real time.
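The difference between the two iterator types can be illustrated with a toy shard. The `Shard` class below is a hypothetical in-memory model that mirrors the behaviour described in this answer; it is not the DynamoDB Streams API.

```python
class Shard:
    """Toy shard: an append-only list of records."""

    def __init__(self):
        self.records = []

    def put(self, record):
        self.records.append(record)

    def get_iterator(self, iterator_type):
        if iterator_type == "TRIM_HORIZON":
            return 0                      # oldest record still in the shard
        if iterator_type == "LATEST":
            return len(self.records)      # just past the newest record
        raise ValueError(iterator_type)

    def get_records(self, position):
        # Returns records from `position` on, plus the next position to read from.
        return self.records[position:], len(self.records)


shard = Shard()
shard.put("change-1")
shard.put("change-2")

latest = shard.get_iterator("LATEST")
oldest = shard.get_iterator("TRIM_HORIZON")

seen_latest, latest = shard.get_records(latest)      # nothing yet
seen_oldest, oldest = shard.get_records(oldest)      # both existing records

shard.put("change-3")
new_latest, latest = shard.get_records(latest)       # only the new record
```

With LATEST, records show up only if they are written after the iterator was created and the caller keeps polling with the returned position, which matches the "sometimes there are records" symptom in the question.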

AWS Data Structure and Stack Suggestion for highly filterable data

Firstly, let me know if I should place this in a different Community. It is programming related but less than I would prefer.
I am creating a mobile app which I intend to base on AWS AppSync unless I can determine it is a poor fit.
I want to store a fairly large set of data, say a half million records.
From these records, I need to be able to grab all entries based on a tag and page them from the larger set.
An example of this data would be:
{
  "name": "Product123",
  "tags": [
    {
      "name": "1880",
      "type": "year",
      "value": 7092
    },
    {
      "name": "f",
      "type": "gender",
      "value": 4120692
    }
  ]
}
Various objects may or may not have a specific tag but may have up to 500 tags or more (the seed of initial data has 130 tags). My filter would ignore them if they did not match but return them if they did.
In reading about Query vs Scan on DynamoDB, I feel like my current data structure would require mostly scanning and be inefficient. Efficiency is only a real restriction due to cost.
With cost in mind, I will focus on the cost per user to access this data in filtered sets. Say 100,000 users for now each filtering and paging data many times a day.
Your concept of tags doesn't sound too different from the concept of Cognito User Pools' groups with AppSync - authentication based on groups will only return items allowed for groups that the user making the request is in. Cognito's default group limit is 25 per user pool, so while convenient out of the box, it wouldn't itself help you much. Instead, it's interesting just because it's similar conceptually, and can give you insight by looking at how it works internally.
If you go into the AppSync console and set up a request mapping template for groups auth, you'll see that it uses a scan and the contains operation. Doing something similar would probably be your best bet here, if you really want to use Dynamo. If you find that prohibitively costly, you could use a Lambda data source, which allows you to use any data store, if you have one in mind that's a little more flexible for this type of action.
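To make the cost concern concrete, here is a rough sketch of a scan with a contains-style filter and paging, run over a plain Python list. The `scan_by_tag` function and its cursor are hypothetical; the real AppSync mapping template and DynamoDB Scan behave differently in detail.

```python
def scan_by_tag(items, tag_name, page_size, start=0):
    """Scan `items` from `start`, returning up to `page_size` matches and a
    cursor for the next call (None when the scan is finished)."""
    matches = []
    for position in range(start, len(items)):
        item = items[position]
        if any(tag["name"] == tag_name for tag in item["tags"]):
            matches.append(item["name"])
            if len(matches) == page_size:
                next_start = position + 1
                return matches, (next_start if next_start < len(items) else None)
    return matches, None


items = [
    {"name": "Product1", "tags": [{"name": "1880", "type": "year"}]},
    {"name": "Product2", "tags": [{"name": "f", "type": "gender"}]},
    {"name": "Product3", "tags": [{"name": "1880", "type": "year"},
                                   {"name": "f", "type": "gender"}]},
    {"name": "Product4", "tags": [{"name": "1880", "type": "year"}]},
]

page1, cursor = scan_by_tag(items, "1880", page_size=2)
page2, cursor2 = scan_by_tag(items, "1880", page_size=2, start=cursor)
```

Every call has to touch each item it passes over, matching or not, which is why this approach gets expensive as the data set and the number of users grow.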

BigQuery tabledata:list output into a bigquery table

I know there is a way to place the results of a query into a table; there is a way to copy a whole table into another table; and there is a way to list a table piecemeal (tabledata:list using startIndex, maxResults and pageToken).
However, what I want to do is go over an existing table with tabledata:list and output the results piecemeal into other tables. I want to use this as an efficient way to shard a table.
I cannot find a reference to such a functionality, or any workaround to it for that matter.
Important to realize: the Tabledata.List API is not part of BQL (BigQuery SQL), but rather part of the BigQuery API, which you can use from the client of your choice.
That said, the logic you outlined in your question can be implemented in many ways. Below is an example (high-level steps):
1) Call Tabledata.List within a loop, using pageToken for the next iteration or to exit the loop.
2) In each iteration, process the response from Tabledata.List, extract the actual data, and insert it into the destination table using streaming inserts with the Tabledata.InsertAll API. You can also have an inner loop that goes through the rows extracted in a given iteration and decides which one goes to which table/shard.
This is very generic logic, and the particular implementation depends on the client you use.
Hope this helps
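The paging loop described in this answer has roughly the following shape. This is a sketch against in-memory data: `list_rows` is a hypothetical stand-in for a paged list call, and appending to a dict stands in for the streaming insert, so no real BigQuery client calls are shown.

```python
ROWS = [{"id": i, "region": "eu" if i % 2 else "us"} for i in range(7)]


def list_rows(page_token=None, max_results=3):
    """Hypothetical stand-in for a paged list call: returns rows and a next token."""
    start = int(page_token) if page_token else 0
    end = start + max_results
    next_token = str(end) if end < len(ROWS) else None
    return ROWS[start:end], next_token


def shard_table(route):
    """Page through the source and append each row to the shard `route` picks."""
    shards = {}
    token = None
    while True:
        rows, token = list_rows(page_token=token)
        for row in rows:
            # Stand-in for a streaming insert into the destination table.
            shards.setdefault(route(row), []).append(row)
        if token is None:
            return shards


shards = shard_table(lambda row: "table_" + row["region"])
```

The `route` callback is where the sharding rule lives; here it is a hypothetical rule that splits rows by a `region` field.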
For what you describe, I'd suggest you use the batch version of Cloud Dataflow:
https://cloud.google.com/dataflow/
Dataflow already supports BigQuery tables as sources and sinks, and will keep all data within Google's network. This approach also scales to arbitrarily large tables.
TableData.list-ing your entire table might work fine for small tables, but network overhead aside, it is definitely not recommended for anything of moderate size.

Auto-increment on Azure Table Storage

I am currently developing an application for Azure Table Storage. In that application I have a table which will have relatively few inserts (a couple of thousand per day), and the primary key of these entities will be used in another table, which will have billions of rows.
Therefore I am looking for a way to use an auto-incremented integer, instead of GUID, as primary key in the small table (since it will save lots of storage and scalability of the inserts is not really an issue).
There've been some discussions on the topic, e.g. on http://social.msdn.microsoft.com/Forums/en/windowsazure/thread/6b7d1ece-301b-44f1-85ab-eeb274349797.
However, since concurrency problems can be really hard to debug and spot, I am a bit uncomfortable with implementing this on my own. My question is therefore whether there is a well-tested implementation of this.
For everyone who will find it in search, there is a better solution. Minimal time for table lock is 15 seconds - that's awful. Do not use it if you want to create a truly scalable solution. Use Etag!
Create one entity in table for ID (you can even name it as ID or whatever).
1) Read it.
2) Increment.
3) InsertOrUpdate WITH ETag specified (from the read query).
If the last operation (InsertOrUpdate) succeeds, then you have a new, unique, auto-incremented ID. If it fails (exception with HttpStatusCode == 412), it means that some other client changed it. So, repeat steps 1, 2 and 3.
The usual time for Read+InsertOrUpdate is less than 200ms. My test utility with source on github.
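The read, increment, write-with-ETag loop from this answer can be sketched against a toy entity store. `CounterEntity` and `PreconditionFailed` are hypothetical stand-ins for the ID entity and the 412 response; no Azure SDK calls are used.

```python
import threading


class PreconditionFailed(Exception):
    """Plays the role of an HTTP 412 response."""


class CounterEntity:
    """Toy single-entity store with an ETag that changes on every write."""

    def __init__(self):
        self._value, self._etag = 0, 0
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value, self._etag

    def update(self, value, etag):
        with self._lock:
            if etag != self._etag:
                raise PreconditionFailed()
            self._value, self._etag = value, self._etag + 1


def next_id(entity):
    while True:
        value, etag = entity.read()          # 1) read
        candidate = value + 1                # 2) increment
        try:
            entity.update(candidate, etag)   # 3) write only if the ETag is unchanged
            return candidate
        except PreconditionFailed:
            continue                         # another client won; start over


entity = CounterEntity()
ids = []
ids_lock = threading.Lock()


def worker():
    for _ in range(50):
        new_id = next_id(entity)
        with ids_lock:
            ids.append(new_id)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Four concurrent workers each take 50 IDs; the retry loop means collisions cost an extra round trip but never produce a duplicate.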
See UniqueIdGenerator class by Josh Twist.
I haven't implemented this yet but am working on it ...
You could seed a queue with your next ids to use, then just pick them off the queue when you need them.
You need to keep a table to contain the value of the biggest number added to the queue. If you know you won't be using a ton of the integers, you could have a worker every so often wake up and make sure the queue still has integers in it. You could also have a used int queue the worker could check to keep an eye on usage.
You could also hook that worker up so that, if the queue happened to be empty when your code needed an id, it could interrupt the worker's nap to create more keys as soon as possible.
If that call failed, you would need a way to tell the worker you are going to do the work for them (lock), then do the worker's work of getting the next id, and unlock:
lock
get the last key created from the table
increment and save
unlock
then use the new value.
The solution I found that prevents duplicate ids and lets you auto-increment is to:
1) Lock (lease) a blob and let that act as a logical gate.
2) Read the value.
3) Write the incremented value.
4) Release the lease.
5) Use the value in your app/table.
Then if your worker role were to crash during that process, then you would only have a missing ID in your store. IMHO that is better than duplicates.
Here is a code sample and more information on this approach from Steve Marx
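The lease-based steps above, including the "missing ID rather than duplicate" outcome after a crash, can be sketched as follows. This is not the linked sample: a `threading.Lock` plays the role of the blob lease, the `LeasedCounter` class is hypothetical, and lease expiry is not modelled.

```python
import threading


class LeasedCounter:
    """Toy blob holding the last issued ID, guarded by an exclusive lease."""

    def __init__(self):
        self._lease = threading.Lock()
        self._stored = 0

    def next_id(self, crash_before_use=False):
        with self._lease:                 # acquire the lease; released on exit
            value = self._stored + 1      # read the value and increment it
            self._stored = value          # write the incremented value
        if crash_before_use:
            raise RuntimeError("worker crashed after releasing the lease")
        return value


counter = LeasedCounter()
first = counter.next_id()
try:
    counter.next_id(crash_before_use=True)    # ID 2 is consumed but never used
except RuntimeError:
    pass
third = counter.next_id()
```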
If you really need to avoid GUIDs, have you considered using something based on date/time and then leveraging partition keys to minimize the concurrency risk?
Your partition key could be by user, year, month, day, hour, etc and the row key could be the rest of the datetime at a small enough timespan to control concurrency.
Of course, you have to ask yourself, at the price of data in Azure, whether avoiding a GUID is really worth all of this extra effort (assuming a GUID will just work).
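One possible shape for such date/time-based keys, assuming a partition per user and hour with the rest of the timestamp as the row key. The `make_keys` helper and the key formats are illustrative choices, not an Azure convention.

```python
from datetime import datetime, timezone


def make_keys(user, moment):
    """Partition by user and hour; the row key carries the rest of the timestamp."""
    partition_key = f"{user}_{moment:%Y%m%d%H}"
    row_key = f"{moment:%M%S}{moment.microsecond:06d}"
    return partition_key, row_key


keys = make_keys("david", datetime(2013, 5, 17, 14, 3, 9, 120, tzinfo=timezone.utc))
```

Two writes for the same user in the same microsecond would still collide, so this reduces the concurrency risk rather than removing it.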