S3 last-modified timestamp for eventually-consistent overwrite PUTs - amazon-web-services

The AWS S3 docs state that:
Amazon S3 offers eventual consistency for overwrite PUTS and DELETES in all regions.
http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel
The timespan until full consistency is reached can vary. During this period GET requests may return the previous object or the updated object.
My question is:
When is the last-modified timestamp updated? Is it updated immediately after the overwrite PUT succeeds but before full consistency is reached, or is it only updated after full consistency is achieved?
I suspect the former but I can't find any documentation which clearly states this.

The Last-Modified timestamp should match the Date value returned in the response headers from the successful PUT request.
To my knowledge, this is not explicitly documented, but it can be derived from what is documented.
When you overwrite an object, it's not the overwriting itself that may be delayed by the eventual consistency model -- it's the availability of the overwritten content at a given S3 node (S3 is replicated to multiple nodes within the S3 region).
The Last-Modified timestamp, like the rest of the metadata, is established at the time of object creation and immutable, thereafter.
It is, in fact, not the "modification" time of the object at all, it is the creation time of the object. The explanation may sound pedantic, but it is accurate in the strictest sense: S3 objects and their metadata cannot in fact be modified at all, they can only be overwritten. When you "overwrite" an object in S3, what you are actually doing is creating a new object, reusing the old object's key (path+file name). The availability of this new object at a given S3 node (replication) is what may be delayed by the eventual consistency model... not the actual creation of the new object that overwrites the old one... hence there would be no reason for Last-Modified to be impacted by the replication delay (assuming there is a replication delay -- eventual consistency can at times be indistinguishable from immediate consistency).
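One way to check this claim yourself is to compare the Date header of the PUT response with the Last-Modified value reported afterwards. The sketch below assumes `s3` behaves like a boto3 S3 client (`put_object`, `head_object`); the helper names `put_and_compare` and `within` are made up for illustration, and the two-second tolerance is an arbitrary choice, not a documented bound.

```python
from email.utils import parsedate_to_datetime


def put_and_compare(s3, bucket, key, body):
    """Overwrite `key`, then return (PUT Date header, Last-Modified) as datetimes.

    `s3` is assumed to behave like a boto3 S3 client.
    """
    put_resp = s3.put_object(Bucket=bucket, Key=key, Body=body)
    # boto3 exposes the raw HTTP response headers with lower-cased names.
    put_date = parsedate_to_datetime(
        put_resp["ResponseMetadata"]["HTTPHeaders"]["date"]
    )

    head_resp = s3.head_object(Bucket=bucket, Key=key)
    last_modified = head_resp["LastModified"]  # timezone-aware datetime in boto3
    return put_date, last_modified


def within(a, b, seconds=2):
    """True when two datetimes differ by at most `seconds` (arbitrary tolerance)."""
    return abs((a - b).total_seconds()) <= seconds
```

If the reasoning above holds, `within(*put_and_compare(s3, bucket, key, body))` should be true no matter how long replication takes.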

This is something S3 does that is absolutely terrible.
Basically in Linux you have the mtime which is the time the file was last modified on the filesystem. Any S3 client could gather the mtime and set the Last-Modified time on S3 so that it would maintain when things were actually last modified.
Instead, Amazon just does this based on the object creation and this is effectively a massive problem if you ever just want to use the data as data outside of the original application that put it there.
So if you download a file from S3, your client would likely set the modified time and if it was uploaded to s3 immediately as it was created then you would at least have a near correct timestamp. But the reality is that you might take a picture and it might not get from your phone through the app, through the stack and to S3 for days!
This is not even considering re-uploading the file to s3. Which would compound the problem, as you might re-upload it years later. S3 will just act like Last-Modified is years later when the file was not actually modified.
They really need to allow you to set it, but they remain ambiguous and over-documented in other areas to make this hard to figure out.
https://github.com/s3tools/s3cmd/issues/524
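Since Last-Modified cannot be set by the client, a common workaround is to carry the file's mtime in user-defined object metadata and re-apply it after download. This is only a sketch: the metadata key `mtime` and both helper names are arbitrary choices for illustration, not an S3 convention.

```python
import os


def mtime_metadata(path):
    """Build a user-metadata dict carrying the file's mtime (seconds since the epoch)."""
    return {"mtime": repr(os.stat(path).st_mtime)}


def restore_mtime(path, metadata):
    """Apply a stored mtime to a downloaded file, leaving its atime unchanged."""
    mtime = float(metadata["mtime"])
    atime = os.stat(path).st_atime
    os.utime(path, (atime, mtime))
```

With boto3 the dict would be passed as `Metadata=mtime_metadata(path)` to `put_object` and read back from the `Metadata` field of a `head_object` or `get_object` response. Re-uploading years later then still preserves the original modification time, as long as every client honours the same key.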

Related

Google Cloud Storage metadata updates

I have a bit of a two-part question regarding the nature of metadata update notifications in GCS. // For the mods: if I should split this into two, let me know and I will.
I have a bucket in Google Cloud Storage, with Pub/Sub notifications configured for object metadata changes. I routinely get doubled metadata updates, seemingly out of nowhere. What happens is that at one point, a Cloud Run container reads the object designated by the notification and does some things that result in
a) a new file being added.
b) an email being sent.
And this should be the end of it.
However, approximately 10 minutes later, a second notification fires for the same object, with the metageneration incremented but no actual changes being evident in the notification object.
Strangely, the ETag seems to change minimally (CJ+2tfvk+egCEG0 -> CJ+2tfvk+egCEG4), but the CRC32C and MD5 checksums remain the same - this is correct in the sense that the object is not being written.
The question is twofold, then:
- What exactly constitutes an increment in the metageneration attribute, when no metadata is being set/updated?
- How can the ETag change if the underlying data does not, as shown by the checksums (I guess the documentation does say "that they will change whenever the underlying data changes"[1], which does not strictly mean they cannot change otherwise).
1: https://cloud.google.com/storage/docs/hashes-etags#_ETags
As commented by @Brandon Yarbrough: if the metageneration number increases, the most likely cause is an explicit call from somewhere unexpected that updates the metadata in some fashion. One way to verify whether extra update calls are being made is to enable Stackdriver or bucket access logs.
Regarding the ETag changes, the ETag documentation on Cloud Storage states that
Users should make no assumptions about those ETags except that they will change whenever the underlying data changes.
This indicates that a data change is the only event guaranteed to change the ETag. Other events may change it as well, so you should not rely on ETags to detect file changes.
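To tell a real content change from a metadata-only update, a notification consumer can compare the generation and checksum fields of the object resource and ignore the ETag entirely. A minimal sketch, assuming the notification payload is the JSON object resource with string-valued `generation`, `metageneration` and `crc32c` fields; `classify_change` is a hypothetical helper name.

```python
def classify_change(previous, current):
    """Compare two object resources (dicts) taken from notification payloads.

    Returns "data" if the content changed, "metadata" if only the
    metageneration moved forward, and "none" otherwise. The ETag is
    deliberately not consulted.
    """
    before = (previous["generation"], previous["crc32c"])
    after = (current["generation"], current["crc32c"])
    if before != after:
        return "data"
    if int(current["metageneration"]) > int(previous["metageneration"]):
        return "metadata"
    return "none"
```

A consumer could then skip its side effects (the new file, the email) whenever the result is not "data".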

caveat of the read-after-write consistency for PUTS of new objects in a S3 bucket

From https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html :
Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket in all regions with one caveat. The caveat is that if you make a HEAD or GET request to the key name (to find if the object exists) before creating the object, Amazon S3 provides eventual consistency for read-after-write.
I'm not sure if I understand the caveat correctly. Before creating the object: ok, I haven't yet created an object with the key K, therefore no object with the key K exists; I make a GET request to K... what does my request result in, according to the explanation above?
I'm confused because the explanation tells about the eventual consistency for read-after-write. But there is no write so far.
Update 2020-12-02 This whole discussion is now outdated. Amazon S3 provides strong read-after-write consistency for PUTs and DELETEs of objects in your Amazon S3 bucket in all AWS Regions.
Update I rewrote the answer after reading a comment in this blog post.
I believe this caveat is talking about this scenario
client 1: GET key_a --> this could return an object even though this request was sent earlier.
client 2: PUT key_a
This is possible if client 1's request reaches a node later than the PUT request does.
This situation happens when you have a file to upload, but that file might already exist. So rather than overwrite the existing file, you do the following:
1. Try to GET the file. It doesn't exist, so you get a 404 with "No such key".
2. PUT the file.
3. Try to GET the file immediately afterward (for whatever reason).
In this sequence, step #3 may or may not return the file. Eventually you can retrieve the file, but how long that takes from the time of upload depends on the internals of S3 (I could speculate on why that happens, but it would only be speculation).
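Under the eventual-consistency behaviour described here, the usual client-side remedy for the final GET is to retry with a growing delay. The sketch below is generic: `fetch` is a hypothetical callable that returns the object body, or None when the store answers "no such key", and the attempt count and delays are arbitrary.

```python
import time


def get_with_retry(fetch, attempts=5, delay=0.2, sleep=time.sleep):
    """Call `fetch()` until it returns something other than None.

    The wait doubles after each miss; TimeoutError is raised if the object
    is still not visible after `attempts` tries.
    """
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(delay)
            delay *= 2
    raise TimeoutError(f"object not visible after {attempts} attempts")
```

The `sleep` parameter exists only so the waiting can be replaced in tests.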

If an updated object in S3 serves as a lambda trigger, is there an inherent race condition?

If I update an object in an S3 Bucket, and trigger on that S3 PUT event as my Lambda trigger, is there a chance that the Lambda could operate on the older version of that object given S3’s eventual consistency model?
I’m having a devil of a time parsing out an authoritative answer either way...
Yes, there is a possibility that a blind GET of an object could fetch a former version.
There are at least two solutions that come to mind.
Weak: the notification event data contains the etag of the newly-uploaded object. If the object you fetch doesn't have this same etag in its response headers, then you know it isn't the intended object.
Strong: enable versioning on the bucket. The event data then contains the object versionId. When you download the object from S3, specify this exact version in the request. The consistency model is not as well documented when you overwrite an object and then download it with a specific version-id, so it is possible that this might result in an occasional 404 -- in which case, you almost certainly just spared yourself from fetching the old object -- but you can at least be confident that S3 will never give you a version other than the one explicitly specified.
If you weren't already using versioning on the bucket, you'll want to consider whether to keep old versions around, or whether to create a lifecycle policy to purge them... but one brilliantly-engineered feature about versioning is that the parts of your code that were written without awareness of versioning should still function correctly with versioning enabled -- if you send non-versioning-aware requests to S3, it still does exactly the right thing... for example, if you delete an object without specifying a version-id and later try to GET the object without specifying a version-id, S3 will correctly respond with a 404, even though the "deleted" version is actually still in the bucket.
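Both ideas can be combined in the Lambda code: request the exact version when the event carries a versionId, and compare the ETag in every case. A sketch, assuming `s3` behaves like a boto3 S3 client and `record` is one entry of the event's `Records` list; `fetch_exact_version` is a hypothetical helper name.

```python
from urllib.parse import unquote_plus


def fetch_exact_version(s3, record):
    """Download the object an S3 event record refers to, or raise on a mismatch."""
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["s3"]["object"]["key"])
    expected_etag = record["s3"]["object"]["eTag"]
    version_id = record["s3"]["object"].get("versionId")  # absent without versioning

    kwargs = {"Bucket": bucket, "Key": key}
    if version_id:
        kwargs["VersionId"] = version_id

    resp = s3.get_object(**kwargs)
    # The response header value is wrapped in double quotes; the event value is not.
    if resp["ETag"].strip('"') != expected_etag:
        raise RuntimeError("fetched object does not match the event's ETag")
    return resp["Body"].read()
```

If the requested version is not yet visible, the client call itself fails rather than returning older data, so the caller can catch that error and retry.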
How does the file get there in the first place? I'm asking, because if you could reverse the order, it'd solve your issue as you put your file in s3 via a lambda that before overwriting the file, can first get the existing version from the bucket and do whatever you need.

Is AWS S3 read guaranteed to return a newly created object?

I've been reading the docs regarding read-after-write consistency with AWS S3 but I'm still unsure about this.
If I write an object to S3 and after getting a successful response from my write operation, I immediately attempt to read it, is the read operation guaranteed to return the object?
In other words, is it possible that the read operation will fail because it can't find the object? Because the read happened too soon after the write?
I'm only talking about new PUTs here, not updates to existing objects.
Yes, the read is guaranteed to return the object (for new objects only), with one caveat:
As per AWS documentation:
Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket in all regions with one caveat. The caveat is that if you make a HEAD or GET request to the key name (to find if the object exists) before creating the object, Amazon S3 provides eventual consistency for read-after-write.
Amazon S3 offers eventual consistency for overwrite PUTS and DELETES in all regions.
EDIT: credits to @Michael - sqlbot, more on the HEAD (or GET) caveat:
If you send a GET or HEAD before the object exists, such as to check whether there's an object there before you upload, then the upload is not immediately consistent for read requests even after the upload is complete, because S3 has already made the only immediately consistent internal query it's going to make for that object, discovering, authoritatively, that there's no such key. The object creation becomes eventually consistent, since the creation has to "overwrite" the previous lookup that found nothing.
Based on the table provided in the link, "consistent reads" will never be stale.
The link above also has a nice example of how "read-after-write consistency" and "eventual consistency" work.
I would like to add this caution note to this answer to make things more clear:
Amazon S3 achieves high availability by replicating data across multiple servers within Amazon's data centers. If a PUT request is successful, your data is safely stored. However, information about the changes must replicate across Amazon S3, which can take some time, and so you might observe the following behaviors:
A process writes a new object to Amazon S3 and immediately lists keys
within its bucket. Until the change is fully propagated, the object
might not appear in the list.

Are writes to Amazon S3 atomic (all-or-nothing)?

I have a large number of files that I am reading and writing to S3.
I am just wondering if I need to code for the case where a file is "half written" e.g. the S3 PUT / Write only "half" worked.
Or are writes to S3 all-or-nothing?
I know there is a read-write eventual consistency issue which (I think) is largely a separate issue.
See S3 PUT documentation:
Amazon S3 never adds partial objects; if you receive a success response, Amazon S3 added the entire object to the bucket.
For all regions except US Standard (us-east-1) you get read-after-write-consistency. This means that if you get an HTTP 200 OK for your PUT, you can read the object right away.
If your request is dropped in the middle, you would not get an HTTP 200 and your object would not be written at all.
UPDATE: All regions now support read-after-write consistency (thanks @jeff-loughridge):
https://aws.amazon.com/about-aws/whats-new/2015/08/amazon-s3-introduces-new-usability-enhancements/
From the docs:
Updates to a single key are atomic. For example, if you PUT to an existing key from one thread and perform a GET on the same key from a second thread concurrently, you will get either the old data or the new data, but never partial or corrupt data.
This answer is somewhat similar to the existing ones, but it stresses that not only is there no risk of leaving a partially-written object behind, but a reader will also never see a partially-written object.
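If you want an end-to-end check on top of the all-or-nothing guarantee, you can send an MD5 digest with the upload and compare it with the ETag that comes back. A sketch with two hypothetical helpers; note the assumption that the ETag equals the hex MD5 of the body, which holds for simple single-part uploads but not for multipart uploads or some server-side encryption modes.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Base64-encoded MD5 digest, the format the Content-MD5 header expects."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def etag_matches(body: bytes, etag: str) -> bool:
    """True when a (possibly quoted) ETag equals the hex MD5 of `body`.

    Assumption: single-part upload where the ETag is the plain MD5.
    """
    return etag.strip('"') == hashlib.md5(body).hexdigest()
```

With boto3 the digest would be passed as `ContentMD5=content_md5(body)` to `put_object`, so that S3 rejects the request if the bytes it received do not match.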