Amazon S3 - What does eventual consistency mean in regard to delete operations? - amazon-web-services

I visited Amazon's website and read the available information regarding eventual consistency, but it's still not completely clear to me.
What I am still not sure about is the behavior of S3 in the timeframe between the execution of an update/delete and the moment when consistency is eventually achieved.
For example, what will happen if I delete object A and subsequently execute a HEAD operation for object A multiple times?
I suppose that at some point (when the deletion becomes consistent) I will start getting a RESOURCE_NOT_FOUND error consistently, but prior to that moment what should I expect to get?
I see two options.
1) Every HEAD operation succeeds up to a point in time and after that every HEAD operation constantly fails with RESOURCE_NOT_FOUND.
2) Each HEAD operation succeeds or fails "randomly" until some moment in which the eventual consistency is achieved.
Could someone clarify which of the two should be the expected behavior?
Many thanks.

I see two options.
It could be either of these. Neither one is necessarily the "expected" behavior. Eventually, requests would all return 404.
S3 is a large-scale system, distributed across multiple availability zones in the region, so each request could hit one of several possible endpoints, each of which could reflect the bucket's state at a slightly different point in time. As long as they are all past the point where the object is deleted, they should consistently return 404, but the state of bucket index replication isn't exposed.
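A quick way to observe this empirically is to poll HEAD after the DELETE. Here is a minimal sketch with boto3; the bucket and key names are placeholders, and individual attempts may land on replicas in either state during the window:

```python
import time
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
bucket, key = "my-bucket", "my-key"  # placeholders

s3.delete_object(Bucket=bucket, Key=key)

# Poll HEAD; during the eventual-consistency window individual calls
# may succeed (stale replica) or fail with 404 (up-to-date replica).
for i in range(20):
    try:
        s3.head_object(Bucket=bucket, Key=key)
        print(f"attempt {i}: still visible (stale read)")
    except ClientError as e:
        if e.response["Error"]["Code"] == "404":
            print(f"attempt {i}: 404 Not Found")
        else:
            raise
    time.sleep(0.5)
```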

Related

AWS S3 Eventual Consistency and read after write Consistency

Help me to better understand these concepts that I can't fully grasp.
Talking about AWS S3 consistency models, I'll try to explain what I've grasped.
Please demystify or confirm these claims.
First of all:
Talking about "read after write" relates only to new writes/creation of objects that didn't exist before.
Talking about "eventual consistency" relates to modifying existing objects (updating or deleting).
Are these first concepts correct?
Then,
Eventual consistency: a client that accesses a datum before it has been completely written to a node can read an old version of the object, because the write may still be in progress and the object might not have been committed yet.
This behavior is universally tolerated in distributed systems, where this type of consistency is preferred to the alternative of waiting for some sort of lock to be released once the object has been committed.
Read-after-write consistency: the object is immediately available to the client, and the client will read the "real" version of the object, never an old version; if I've understood correctly, this is true only for new objects.
If so, why are these replication methods so different, and why do they produce different consistency guarantees?
The concept of "eventual consistency" is more natural to grasp, because you have to consider the latency to propagate the data to different nodes, and a client might access it during this time and get no fresh data yet.
But why should "read after write" be immediate? Propagating a modification of an existing datum, or creating a new datum, should have the same latency. I can't understand the difference.
Can you please tell me whether my claims are correct, and explain this concept in a different way?
talking about "read after write" is related only to "new writings"/creation of objects that didn't exist before.
Yes
talking about "eventual consistency" is related to "modifying existing objects" (updating or deleting)
Almost correct, but be aware of one caveat. Here is a quote from the documentation:
The caveat is that if you make a HEAD or GET request to a key name before the object is created, then create the object shortly after that, a subsequent GET might not return the object due to eventual consistency.
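A minimal sketch of that caveat with boto3 (the bucket and key names are placeholders): a HEAD issued before the object exists can cause a GET made right after the PUT to still miss the object.

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
bucket, key = "my-bucket", "brand-new-key"  # placeholders

# 1. HEAD a key that does not exist yet -> 404.
try:
    s3.head_object(Bucket=bucket, Key=key)
except ClientError as e:
    print("before create:", e.response["Error"]["Code"])  # "404"

# 2. Create the object.
s3.put_object(Bucket=bucket, Key=key, Body=b"hello")

# 3. Because of the earlier negative lookup, an immediate GET is no longer
#    guaranteed read-after-write; it may still 404 until the change propagates.
try:
    obj = s3.get_object(Bucket=bucket, Key=key)
    print("after create:", obj["Body"].read())
except ClientError as e:
    print("after create:", e.response["Error"]["Code"])  # possibly "NoSuchKey"
```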
Regarding why they offer different consistency models, here is my understanding/speculation. (Note: the following content might be wrong, since I've never worked on S3 and don't know its actual internal implementation.)
S3 is a distributed system, so it's very likely that S3 uses some internal caching service. Think of how a CDN works; I think you can use a similar analogy here. In the case where you GET an object whose key is not in the cache yet, it's a cache miss! S3 will fetch the latest version of the requested object, save it into the cache, and return it to you. This is the read-after-write model.
On the other hand, if you update an object that's already in the cache, then besides replicating your new object to other availability zones, S3 needs to do extra work to update the existing data in the cache. Therefore, the propagation process will likely take longer. Instead of making you wait on the request, S3 made the decision to return the existing data in the cache. That data might be an old version of the object. This is the eventual-consistency model.
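To make that (speculative) caching analogy concrete, here is a toy model; none of this is S3's actual implementation, it just illustrates why new keys can look read-after-write consistent while overwrites look eventually consistent:

```python
# Toy model: a lazily updated read cache in front of an authoritative store.
backing_store = {}   # "authoritative" data
read_cache = {}      # per-node cache, invalidated/refreshed asynchronously

def put(key, value):
    backing_store[key] = value
    # The cache is NOT updated synchronously; invalidation propagates later.

def get(key):
    if key in read_cache:
        return read_cache[key]          # may be a stale (old) version
    value = backing_store.get(key)      # cache miss: fetch the latest version
    read_cache[key] = value
    return value

put("a", "v1")
print(get("a"))   # "v1" -> cache miss, latest data (read-after-write for new keys)
put("a", "v2")
print(get("a"))   # "v1" -> stale cache hit (eventual consistency for overwrites)
```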
As Phil Karlton said, there are only two hard things in Computer Science: cache invalidation and naming things. AWS has no good way to fully get around this, and has to make some compromises too.

DynamoDB Eventually consistent reads vs Strongly consistent reads

I recently came to know about two read modes of DynamoDB. But I am not clear about when to choose what. Can anyone explain the trade-offs?
Basically, if you NEED to have the latest values, use a strongly consistent read. You'll get the guaranteed current value.
If your app is okay with potentially outdated information (mere seconds or less out of date), then use eventually consistent reads.
Use-cases for strongly consistent reads:
Bank balance (Want to know the latest amount)
Location of a locomotive on a train network (Need absolute certainty to guarantee safety)
Stock trading (Need to know the latest price)
Use-cases for eventually consistent reads:
Number of Facebook friends (Does it matter if another was added in the last few seconds?)
Number of commuters who used a particular turnstile in the past 5 minutes (Not important if it is out by a few people)
Stock research (Doesn't matter if it's out by a few seconds)
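In code the choice is a single flag on the read call. A minimal boto3 sketch (the table name and key are placeholders):

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("User")  # placeholder table name

# Eventually consistent read (the default): cheaper, may lag by a moment.
item_fast = table.get_item(Key={"user_id": "123"})

# Strongly consistent read: reflects all successful prior writes,
# but consumes twice the read capacity and has slightly higher latency.
item_latest = table.get_item(Key={"user_id": "123"}, ConsistentRead=True)

print(item_fast.get("Item"), item_latest.get("Item"))
```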
Apart from the other answers, here briefly is the reason for these read modes:
Let's say you have a User table in the eu-west-1 region. Without you being aware of it, there are multiple Availability Zones that AWS handles in the background, replicating your data in case of failure, etc. Basically there are copies of your table, and once you insert an item, multiple resources need to be updated.
But when you read, there is a chance that you are reading from a not-yet-updated copy without being aware of it. It usually takes under a second for DynamoDB to update. This is why it's called eventually consistent: it will eventually become consistent within a short amount of time :)
Knowing this reasoning helps me understand and design my use cases when making decisions.

Is creating a new version of an object in AWS S3 eventually consistent or read-after-write consistent?

I see from Amazon's documentation that writing a new object to S3 is read-after-write consistent, but that update and delete operations are eventually consistent. I would guess that pushing a new version of an object with versioning turned on would be eventually consistent like an update, but I can't find any documentation to confirm. Does anyone know?
Edit: My question is regarding the behavior of a GET with or without an explicit version specified.
I'd really like read-after-write behavior on updates for my project, which I may be able to simulate doing inserts only, but it might be easier if pushing new versions of an object provided the desired behavior.
As you already know...
Q: What data consistency model does Amazon S3 employ?
Amazon S3 buckets in all Regions provide read-after-write consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES.
— https://aws.amazon.com/s3/faqs/
...and that's about all there is, as far as official statements on the consistency model.
However, I would suggest that the remainder can be extrapolated with a reasonable degree of certainty from this, along with assumptions we can reasonably make, plus some additional general insights into the inner workings of S3.
For example, we know that S3 does not actually store the objects in a hierarchical structure, yet:
Amazon S3 maintains an index of object key names in each AWS region. Object keys are stored lexicographically across multiple partitions in the index.
http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html
This implies that S3 has at least two discrete major components, a backing store where the data is persisted, and an index of keys pointing to locations in the backing store. We also know that both of them are distributed across multiple availability zones and thus both of them are replicated.
The fact that the backing store is separate from the index is not a foregone conclusion until you remember that storage classes are selectable on a per-object basis, which almost necessarily means that the index and the data are stored separately.
From the fact that overwrite PUT operations are eventually-consistent, we can conclude that even in a non-versioned bucket, an overwrite is not in fact an overwrite of the backing store, but rather an overwrite of the index entry for that object's key, and an eventual freeing of the space in the backing store that's no longer referenced by the index.
The implication I see in these assertions is that the indexes are replicated and it's possible for a read-after-overwrite (or delete) to hit a replica of the index that does not yet reflect the most recent overwrite... but when a read encounters a "no such key" condition in its local index, the system pursues a more resource-intensive path of interrogating the "master" index (whatever that may actually mean in the architecture of S3) to see whether such an object really does exist and the local index replica simply hasn't learned of it yet.
Since the first GET of a new object that has not replicated to the appropriate local index replica is almost certainly a rare occurrence, it is reasonable to expect that the architects of S3 made this allowance for a higher cost "discovery" operation to improve the user experience, when a node in the system believes this may be the condition it is encountering.
From all of this, I would suggest that the most likely behavior you would experience would be this:
GET without a versionId on a versioned object after an overwrite PUT would be eventually-consistent, since the node servicing the read request would not encounter the No Such Key condition, and would therefore not follow the theoretical higher-cost "discovery" model I speculated above.
GET with an explicit request for the newest versionId would be immediately consistent on an overwrite PUT, since the reading node would likely launch the high-cost strategy to obtain upstream confirmation of whether its index reflected all the most-current data, although of course the condition here would be No Such Version, rather than No Such Key.
I know speculation is not what you were hoping for, but absent documented confirmation or empirical (or maybe some really convincing anecdotal) evidence to the contrary, I suspect this is the closest we can come to drawing credible conclusions based on the publicly-available information about the S3 platform.
Specifying a version ID during a GET operation is always strongly consistent for versioning-enabled objects.
I would not assume anything.
What I would do is capture the version ID (returned in the x-amz-version-id header) from the PUT request and issue a GET (or, even better, a HEAD) to ensure that the object was indeed persisted and is visible in S3.
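A minimal sketch of that approach with boto3 (bucket and key are placeholders): capture the VersionId returned by the PUT and issue a HEAD for exactly that version.

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-versioned-bucket", "my-key"  # placeholders

# PUT the new version; on a versioning-enabled bucket the response
# carries the x-amz-version-id header, exposed here as "VersionId".
resp = s3.put_object(Bucket=bucket, Key=key, Body=b"new content")
version_id = resp["VersionId"]

# HEAD exactly that version to confirm it is persisted and visible.
head = s3.head_object(Bucket=bucket, Key=key, VersionId=version_id)
print(version_id, head["ContentLength"])
```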

When a ConcurrencyException occurs, what gets written?

We use RavenDB in our production environment. It stores millions of documents, and gets updated pretty much constantly during the day.
We have two boxes load-balanced using a round-robin strategy which replicate to one another.
Every week or so, we get a ConcurrencyException from Raven. I understand that this basically means that one of the servers was told to insert or update the same document within a short timeframe - it's kind of like a conflict exception, except occurring on the same server instead of across two replicating servers.
What happens when this error occurs? Can I assume that at least one of the writes succeeded? Can I predict which one? Is there anything I can do to make these exceptions less likely?
ConcurrencyException means that on a single server, you have two writes to the same document at the same instant.
That leads to:
One write is accepted.
One write is rejected (with concurrency exception).
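In practice you usually catch the rejection and retry the failed write against a fresh load of the document. A sketch of that retry shape in Python, with a hypothetical client and ConcurrencyException standing in for the real RavenDB API:

```python
# Hypothetical store/session objects; the real RavenDB client API differs,
# but the retry shape is the same: reload, reapply the change, save again.
class ConcurrencyException(Exception):
    pass

def update_with_retry(store, doc_id, apply_change, max_attempts=3):
    for attempt in range(max_attempts):
        session = store.open_session()
        doc = session.load(doc_id)      # fresh read, picks up the winning write
        apply_change(doc)               # re-apply our change on top of it
        try:
            session.save_changes()      # may fail if another write races us again
            return doc
        except ConcurrencyException:
            if attempt == max_attempts - 1:
                raise                   # give up after the final attempt
```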

REST search interface and the idempotency of GET

In order to stick with the REST concepts, such as safe operations, idempotency, etc., how can one implement a complex search operation involving multiple parameters?
I have seen Google's implementation, and that is creative. What is an option, other than that?
The idempotent requirement is what is tripping me up, as the operation will definitely not return the same results for the same criteria; say, searching for customers named "Smith" will not return the same set every time, because more "Smith" customers are added all the time. My instinct is to use GET for this, but for a true search feature, the result would not seem to be idempotent, and would need to be marked as non-cacheable due to its fluid result set.
To put it another way, the basic idea behind idempotency is that the GET operation doesn't affect the state of the resource. That is, the GET can safely be repeated with no ill side effects.
However, an idempotent request has nothing to do with the representation of the resource.
Two contrived examples:
GET /current-time
GET /current-weather/90210
As should be obvious, these resources will change over time; some resources change more rapidly than others. But the GET operation itself does not affect the actual resource.
Contrast to:
GET /next-counter
This is, obviously I hope, not an idempotent request. The request itself is changing the resource.
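A tiny Flask sketch of the contrast (hypothetical endpoints, not tied to any real service): the first handler reads changing state without modifying it, while the second mutates state on every GET and so should not be a GET at all.

```python
from datetime import datetime, timezone
from flask import Flask, jsonify

app = Flask(__name__)
counter = 0

@app.route("/current-time")
def current_time():
    # The representation changes over time, but the GET itself
    # does not affect the resource: safe and idempotent.
    return jsonify(now=datetime.now(timezone.utc).isoformat())

@app.route("/next-counter")
def next_counter():
    # The GET itself changes the resource: NOT idempotent,
    # and should really be a POST.
    global counter
    counter += 1
    return jsonify(value=counter)
```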
Also, there's nothing that says an idempotent operation has NO side effects. Clearly, many systems log accesses and requests, including GETs. Therefore, when you do GET /resource, the logs will change as a result of that GET. That kind of side effect doesn't make the GET non-idempotent. The fundamental premise is the effect on the resource itself.
But what about, say:
GET /logs
If the logs register every request, and the GET is returning the logs in their current state, does that mean that the GET in this case is not idempotent? Yup! Does it really matter? Nope. Not for this one edge case. Just the nature of the game.
What about:
GET /random-number
If you're using a pseudo-random number generator, most of those feed upon themselves: starting with a seed and feeding their results back into themselves to get the next number. So, using a GET here may not be idempotent. But is it? How do you know how the random number is generated? It could be a white noise source. And why do you care? If the resource is simply a random number, you really don't know if the operation is changing it or not.
But just because there may be exceptions to the guidelines, doesn't necessarily invalidate the concepts behind those guidelines.
Resources change, that's a simple fact of life. The representation of a resource does not have to be universal, or consistent across requests, or consistent across users. Literally, the representation of a resource is what GET delivers, and it is up to the application, using who knows what criteria, to determine that representation for each request. Idempotent requests are very nice because they work well with the rest of the REST model -- things like caching and content negotiation.
Most resources don't change quickly, and relying on specific transactions, using non-idempotent verbs, offers a more predictable and consistent interface for clients. When a method is supposed to be idempotent, clients will be quite surprised when it turns out not to be the case. But in the end, it's up to the application and its documented interface.
GET is safe and idempotent when properly implemented. That means:
It will cause no client-visible side-effects on the server side
When directed at the same URI, it causes the same server-side function to be executed each time, regardless of how many times it is issued, or when
What is not said above is that GET to the same URI always returns the same data.
GET causes the same server-side function to be executed each time, and that function is typically, "return a representation of the requested resource". If that resource has changed since the last GET, the client will get the latest data. The function which the server executes is the source of the idempotency, not the data which it uses as input (the state of the resource being requested).
If a timestamp is used in the URI to make sure that the server data being requested is the same each time, that just means that something which is already idempotent (the function implementing GET) will act upon the same data, thereby guaranteeing the same result each time.
It would be idempotent for the same dataset. You could achieve this with a timestamp filter.
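A minimal sketch of that idea (a hypothetical Flask endpoint over in-memory data): pinning the search to an as_of timestamp makes repeated GETs of the same URI return the same result set even as new customers are added.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical data; each customer records when it was created.
CUSTOMERS = [
    {"name": "Smith", "created_at": "2024-01-01T00:00:00Z"},
    {"name": "Smith", "created_at": "2024-06-01T00:00:00Z"},
]

@app.route("/customers")
def search_customers():
    name = request.args.get("name")
    # e.g. GET /customers?name=Smith&as_of=2024-03-01T00:00:00Z
    as_of = request.args.get("as_of")
    results = [c for c in CUSTOMERS if c["name"] == name]
    if as_of:
        # Only include customers that existed at the pinned time,
        # so the same URI always yields the same result set.
        results = [c for c in results if c["created_at"] <= as_of]
    return jsonify(results)
```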