Help me understand these concepts better; I can't fully grasp them.
Talking about AWS S3 consistency models, I'll try to explain what I've understood.
Please demystify or confirm these claims for me.
First of all:
Talking about "read after write": this applies only to new writes, i.e. the creation of objects that didn't exist before.
Talking about "eventual consistency": this applies to modifying existing objects (updating or deleting).
Are these first concepts correct?
Then:
Eventual consistency: a client that accesses a datum before it has been completely written to a node can read an old version of the object, because the write may still be in progress and the object might not have been committed yet.
This behavior is widely tolerated in distributed systems, where this kind of consistency is preferred to the alternative of waiting for some sort of lock to be released once the object has been committed.
Read-after-write consistency: objects are immediately available to the client, and the client will read the "real" version of the object, never an old one; if I've understood correctly, this is true only for new objects.
If so, why are these replication methods so different, and why do they produce different consistency guarantees?
The concept of "eventual consistency" is the more natural one to grasp, because you have to account for the latency of propagating data to different nodes, and a client might access the data during that window and not get fresh data yet.
But why should "read after write" be immediate? Propagating a modification to an existing datum, or creating a new datum, should incur the same latency. I can't understand the difference.
Can you please tell me whether my claims are correct, and explain this concept in a different way?
talking about "read after write" is related only to "new writings"/creation of objects that didn't exist before.
Yes
talking about "eventual consistency" is related to "modifying existing objects" (updating or deleting)
Almost correct, but be aware of one caveat. Here is a quote from the documentation:
The caveat is that if you make a HEAD or GET request to a key name before the object is created, then create the object shortly after that, a subsequent GET might not return the object due to eventual consistency.
As to why they offer different consistency models, here is my understanding/speculation. (Note: the following might be wrong, since I've never worked on S3 and don't know its actual internal implementation.)
S3 is a distributed system, so it's very likely that S3 uses some internal caching service. Think of how a CDN works; a similar analogy applies here. In the case where you GET an object whose key is not in the cache yet, it's a cache miss! S3 will fetch the latest version of the requested object, save it into the cache, and return it to you. This is the read-after-write model.
On the other hand, if you update an object that's already in the cache, then besides replicating your new object to other Availability Zones, S3 needs to do extra work to update the existing data in the cache. Therefore, the propagation process will likely take longer. Instead of making you wait on the request, S3 made the decision to return the existing data in the cache. That data might be an old version of the object. This explains the eventual consistency.
As Phil Karlton said, there are only two hard things in Computer Science: cache invalidation and naming things. AWS has no good way to fully get around this, and has to make some compromises too.
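To make the speculation above concrete, here is a tiny toy model in Python. It is not S3's real architecture, just a read-through cache in front of a backing store (the class and method names are invented for illustration) showing why a cache miss tends to give read-after-write behavior while a cache hit can serve stale data.

```python
# Toy model of the speculation above -- NOT how S3 is actually built.
# A read-through cache in front of a backing store: a cache miss forces a
# fetch of the latest value (read-after-write), while a cache hit may return
# a stale value until invalidation propagates (eventual consistency).

class ToyStore:
    def __init__(self):
        self.backing = {}   # authoritative storage
        self.cache = {}     # read-through cache, invalidated lazily

    def put(self, key, value):
        self.backing[key] = value
        # Overwrites do NOT update the cache synchronously in this toy model;
        # the stale cache entry lingers until some background process evicts it.

    def get(self, key):
        if key in self.cache:
            return self.cache[key]        # cache hit: possibly stale
        value = self.backing.get(key)     # cache miss: fetch the latest version
        if value is not None:
            self.cache[key] = value
        return value

store = ToyStore()
store.put("k", "v1")
print(store.get("k"))   # "v1" -- new object, cache miss, read-after-write
store.put("k", "v2")
print(store.get("k"))   # still "v1" -- stale cache entry, eventual consistency
```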
Related
I have a step function which needs a rather large state (memory passed between its states) in order to do its job. This state was larger than the memory that can be passed by the step function engine. Searching online, one solution to this was using S3 buckets as the alternative. Now, I've reached another limitation. S3 is eventually consistent and as a result, I'm losing data from time to time. By that I mean the data read from the bucket is not the latest state.
My question is, does anyone know a better option/solution to keep the state for a step function?
Use a new S3 key every time you modify the state information, then there's no risk of stale data caused by S3's consistency model.
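A minimal boto3 sketch of that suggestion, assuming the state machine's tasks run Python; the bucket name and the write_state/read_state helpers are hypothetical:

```python
import json
import uuid

import boto3

s3 = boto3.client("s3")
BUCKET = "my-stepfunction-state-bucket"  # hypothetical bucket name


def write_state(state: dict) -> str:
    """Write the state under a brand-new key and return that key.

    Because the key has never existed before, a subsequent GET benefits from
    S3's read-after-write consistency for new objects instead of the eventual
    consistency of overwrites.
    """
    key = f"state/{uuid.uuid4()}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(state).encode("utf-8"))
    return key  # pass this key between Step Function states instead of the payload


def read_state(key: str) -> dict:
    obj = s3.get_object(Bucket=BUCKET, Key=key)
    return json.loads(obj["Body"].read())
```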
I recently learned about the two read modes of DynamoDB, but I'm not clear on when to choose which. Can anyone explain the trade-offs?
Basically, if you NEED to have the latest values, use a fully consistent read. You'll get the guaranteed current value.
If your app is okay with potentially outdated information (mere seconds or less out of date), then use eventually consistent reads.
Examples of fully-consistent:
Bank balance (Want to know the latest amount)
Location of a locomotive on a train network (Need absolute certainty to guarantee safety)
Stock trading (Need to know the latest price)
Use-cases for eventually consistent reads:
Number of Facebook friends (Does it matter if another was added in the last few seconds?)
Number of commuters who used a particular turnstile in the past 5 minutes (Not important if it is out by a few people)
Stock research (Doesn't matter if it's out by a few seconds)
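For reference, here is a minimal boto3 sketch of the two read modes; the table name and key schema are hypothetical:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Eventually consistent read (the default): cheaper, may lag by a moment.
eventual = dynamodb.get_item(
    TableName="User",                # hypothetical table
    Key={"userId": {"S": "123"}},    # hypothetical key schema
)

# Strongly consistent read: reflects all prior successful writes,
# at the cost of consuming more read capacity.
consistent = dynamodb.get_item(
    TableName="User",
    Key={"userId": {"S": "123"}},
    ConsistentRead=True,
)
```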
Apart from the other answers, in short, the reason for these read modes is:
Let's say you have a table User in the eu-west-1 region. Without you being aware of it, AWS handles multiple Availability Zones in the background, for example replicating your data in case of failure. Basically there are copies of your table, and once you insert an item, multiple resources need to be updated.
But now, when you want to read, there is a chance that you are reading from a not-yet-updated copy without being aware of it. It usually takes under a second for DynamoDB to update. This is why it's called eventually consistent: it will eventually become consistent within a short amount of time :)
Knowing this reasoning helps me understand and design my use cases when making decisions.
I see from Amazon's documentation that writing a new object to S3 is read-after-write consistent, but that update and delete operations are eventually consistent. I would guess that pushing a new version of an object with versioning turned on would be eventually consistent like an update, but I can't find any documentation to confirm. Does anyone know?
Edit: My question is regarding the behavior of a GET with or without an explicit version specified.
I'd really like read-after-write behavior on updates for my project, which I may be able to simulate by doing inserts only, but it might be easier if pushing new versions of an object provided the desired behavior.
As you already know...
Q: What data consistency model does Amazon S3 employ?
Amazon S3 buckets in all Regions provide read-after-write consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES.
— https://aws.amazon.com/s3/faqs/
...and that's about all there is, as far as official statements on the consistency model.
However, I would suggest that the remainder can be extrapolated with a reasonable degree of certainty from this, along with assumptions we can reasonably make, plus some additional general insights into the inner workings of S3.
For example, we know that S3 does not actually store the objects in a hierarchical structure, yet:
Amazon S3 maintains an index of object key names in each AWS region. Object keys are stored lexicographically across multiple partitions in the index.
http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html
This implies that S3 has at least two discrete major components: a backing store where the data is persisted, and an index of keys pointing to locations in the backing store. We also know that both of them are distributed across multiple availability zones and thus both of them are replicated.
The fact that the backing store is separate from the index is not a foregone conclusion until you remember that storage classes are selectable on a per-object basis, which almost necessarily means that the index and the data are stored separately.
From the fact that overwrite PUT operations are eventually-consistent, we can conclude that even in a non-versioned bucket, an overwrite is not in fact an overwrite of the backing store, but rather an overwrite of the index entry for that object's key, and an eventual freeing of the space in the backing store that's no longer referenced by the index.
The implication I see in these assertions is that the indexes are replicated, and it's possible for a read-after-overwrite (or delete) to hit a replica of the index that does not yet reflect the most recent overwrite... but when a read encounters a "no such key" condition in its local index, the system pursues a more resource-intensive path of interrogating the "master" index (whatever that may actually mean in the architecture of S3) to see whether such an object really does exist but the local index replica simply hasn't learned of it yet.
Since the first GET of a new object that has not replicated to the appropriate local index replica is almost certainly a rare occurrence, it is reasonable to expect that the architects of S3 made this allowance for a higher cost "discovery" operation to improve the user experience, when a node in the system believes this may be the condition it is encountering.
From all of this, I would suggest that the most likely behavior you would experience would be this:
GET without a versionId on a versioned object after an overwrite PUT would be eventually-consistent, since the node servicing the read request would not encounter the No Such Key condition, and would therefore not follow the theoretical higher-cost "discovery" model I speculated above.
GET with an explicit request for the newest versionId would be immediately consistent on an overwrite PUT, since the reading node would likely launch the high-cost strategy to obtain upstream confirmation of whether its index reflected all the most-current data, although of course the condition here would be No Such Version, rather than No Such Key.
I know speculation is not what you were hoping for, but absent documented confirmation or empirical (or maybe some really convincing anecdotal) evidence to the contrary, I suspect this is the closest we can come to drawing credible conclusions based on the publicly-available information about the S3 platform.
Specifying a version ID during a GET operation is always strongly consistent for versioning-enabled objects.
I would not assume anything.
What I would do is capture the version ID (returned in the x-amz-version-id header) from the PUT request and issue a GET (or, even better, a HEAD) to ensure that the object was indeed persisted and is visible in S3.
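A minimal boto3 sketch of that check, assuming versioning is enabled on the bucket; the bucket and key names are placeholders:

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-versioned-bucket"   # placeholder; versioning must be enabled
KEY = "some/object"

# PUT the object; S3 returns the version id it assigned (x-amz-version-id).
put_response = s3.put_object(Bucket=BUCKET, Key=KEY, Body=b"new contents")
version_id = put_response["VersionId"]

# HEAD that exact version to confirm it is persisted and visible.
s3.head_object(Bucket=BUCKET, Key=KEY, VersionId=version_id)
```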
I've got some data that I want to save on Amazon S3. Some of this data is encrypted and some is compressed. Should I be worried about single bit flips? I know of the MD5 hash header that can be added. This (from my experience) will prevent flips in the most unreliable portion of the deal (network communication), however I'm still wondering if I need to guard against flips on disk?
I'm almost certain the answer is "no", but if you want to be extra paranoid you can precalculate the MD5 hash before uploading, compare that to the MD5 hash you get after upload, then when downloading calculate the MD5 hash of the downloaded data and compare it to your stored hash.
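A minimal sketch of that belt-and-braces check with boto3; the bucket and key names are placeholders, and note that the Content-MD5 value must be the base64-encoded binary digest:

```python
import base64
import hashlib

import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"        # placeholder names for illustration
KEY = "important-data"

payload = b"data worth protecting"
local_md5 = hashlib.md5(payload)

# Upload with a Content-MD5 header; S3 rejects the PUT if the bytes it
# receives do not hash to this value.
s3.put_object(
    Bucket=BUCKET,
    Key=KEY,
    Body=payload,
    ContentMD5=base64.b64encode(local_md5.digest()).decode("ascii"),
)

# Later, download and compare against the hash computed before upload.
body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
assert hashlib.md5(body).hexdigest() == local_md5.hexdigest(), "corruption detected"
```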
I'm not sure exactly what risk you're concerned about. At some point you have to defer the risk to somebody else. Does "corrupted data" fall under Amazon's Service Level Agreement? Presumably they know what the file hash is supposed to be, and if the hash of the data they're giving you doesn't match, then it's clearly their problem.
I suppose there are other approaches too:
Store your data with an FEC so that you can detect and correct N bit errors up to your choice of N.
Store your data more than once in Amazon S3, perhaps across their US and European data centers (I think there's a new one in Singapore coming online soon too), with RAID-like redundancy so you can recover your data if some number of sources disappear or become corrupted.
It really depends on just how valuable the data you're storing is to you, and how much risk you're willing to accept.
I see your question from two points of view, a theoretical one and a practical one.
From a theoretical point of view, yes, you should be concerned, and not only about bit flipping but about several other possible problems. In particular, section 11.5 of the customer agreement says that Amazon
MAKE NO REPRESENTATIONS OR WARRANTIES OF ANY KIND, WHETHER EXPRESS, IMPLIED, STATUTORY OR OTHERWISE WITH RESPECT TO THE SERVICE OFFERINGS. [...] WE AND OUR LICENSORS DO NOT WARRANT THAT THE SERVICE OFFERINGS WILL FUNCTION AS DESCRIBED, WILL BE UNINTERRUPTED OR ERROR FREE, OR FREE OF HARMFUL COMPONENTS, OR THAT THE DATA YOU STORE WITHIN THE SERVICE OFFERINGS WILL BE SECURE OR NOT OTHERWISE LOST OR DAMAGED.
Now, in practice, I wouldn't be concerned. If your data is lost, you'll blog about it, and (although they might not face any legal action) their business will be pretty much over.
On the other hand, it depends on how vital your data is. Suppose you were rolling your own stuff in your own data center(s): how would you plan for disaster recovery there? If you say "I'd just keep two copies in two different racks", then use the same technique with Amazon, maybe keeping two copies in two different data centers. (Since you wrote that you are not interested in how to protect against bit flips, I'm providing only a trivial example here.)
Probably not: Amazon uses checksums to protect against bit flips, regularly combing through data at rest to ensure that no bit flips have occurred. So unless you have corruption in all instances of the data within the interval between integrity-check loops, you should be fine.
Internally, S3 uses MD5 checksums throughout the system to detect/protect against bitflips. When you PUT an object into S3, we compute the MD5 and store that value. When you GET an object we recompute the MD5 as we stream it back. If our stored MD5 doesn't match the value we compute as we're streaming the object back we'll return an error for the GET request. You can then retry the request.
We also continually loop through all data at rest, recomputing checksums and validating them against the MD5 we saved when we originally stored the object. This allows us to detect and repair bit flips that occur in data at rest. When we find a bit flip in data at rest, we repair it using the redundant data we store for each object.
You can also protect yourself against bit flips during transmission to and from S3 by providing an MD5 checksum when you PUT the object (we'll return an error if the data we received doesn't match the checksum) and by validating the MD5 when you GET an object.
Source:
https://forums.aws.amazon.com/thread.jspa?threadID=38587
There are two ways of reading your question:
"Is Amazon S3 perfect?"
"How do I handle the case where Amazon S3 is not perfect?"
The answer to (1) is almost certainly "no". They might have lots of protection to get close, but there is still the possibility of failure.
That leaves (2). The fact is that devices fail, sometimes in obvious ways and other times in ways that appear to work but give an incorrect answer. To deal with this, many databases use a per-page CRC to ensure that a page read from disk is the same as the one that was written. This approach is also used in modern filesystems (for example ZFS, which can write multiple copies of a page, each with a CRC, to handle RAID controller failures; I have seen ZFS correct single-bit errors from a disk by reading a second copy, because disks are not perfect).
In general you should have a check to verify that your system is operating as you expect. Using a hash function is a good approach. What approach you take when you detect a failure depends on your requirements. Storing multiple copies is probably the best approach (and certainly the easiest), because you get protection from site failures, connectivity failures, and even vendor failures (by choosing a second vendor), rather than just redundancy in the data itself as with FEC.
Hi, we are trying to implement a process where, when a user does something, their company's credit is deducted accordingly.
But there is a concurrency issue when multiple users in one company participate in the process, because the credit gets deducted incorrectly.
Can anyone point me in the right direction for such an issue?
Thanks very much.
This is a classic problem that is entirely independent of the implementation language(s).
You have a shared resource that is maintaining a persistent data store. (This is typically a database, likely an RDBMS).
You also have a (business) process that uses and/or modifies the information maintained in the shared data store.
When this process can be performed concurrently by multiple actors, the issue of informational integrity arises.
The most common way to address this is to serialize access to the shared resource, so that operations against the shared resource occur in sequence.
This serialization can happen at the actor level or at the shared resource itself, and can take many forms, such as queuing actions, using messaging, or using transactions at the shared resource. It's here that considerations such as system type, application, and the platforms and systems in use become important and determine the design of the overall system.
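As an illustration of serializing at the shared resource, here is a minimal sketch of an atomic, conditional deduction; it uses SQLite purely as a stand-in for whatever RDBMS holds the balance, and the table and column names are made up:

```python
import sqlite3

# Stand-in for whatever RDBMS holds the shared credit balance; the table and
# column names are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, credit INTEGER NOT NULL)")
conn.execute("INSERT INTO company (id, credit) VALUES (1, 100)")
conn.commit()


def deduct_credit(conn, company_id: int, amount: int) -> bool:
    """Atomically deduct `amount` if enough credit remains.

    The database serializes concurrent UPDATEs on the same row, and the WHERE
    clause prevents the balance from going negative, so two users in the same
    company cannot both spend the last credits.
    """
    cur = conn.execute(
        "UPDATE company SET credit = credit - ? WHERE id = ? AND credit >= ?",
        (amount, company_id, amount),
    )
    conn.commit()
    return cur.rowcount == 1   # False means insufficient credit (or no such company)


print(deduct_credit(conn, 1, 30))   # True, balance is now 70
print(deduct_credit(conn, 1, 100))  # False, only 70 left
```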
Take a look at this wikipedia article on db transactions, and then google your way to more technical content on this topic. You may also wish to take a look at messaging systems, and if you are feeling adventurous, also read up on software transactional memory.