Objects get overwritten in S3 while governance mode and legal hold are enabled - amazon-web-services

I'm an absolute beginner in AWS and have been practising for the past 3 months.
Recently I was working on S3 and experimenting with S3 Object Lock. I enabled Object Lock on a specific object with governance mode and a legal hold. Then I tried to overwrite the object with the same file using the following CLI command:
aws s3 cp /Users/John/Desktop/112133.jpg s3://my-buck/112133.jpg
Interestingly, it succeeded, and I confirmed in the console that the new file was uploaded and marked as the latest version. But I read this in the AWS docs:
Bypassing governance mode doesn't affect an object version's legal
hold status. If an object version has a legal hold enabled, the legal
hold remains in force and prevents requests to overwrite or delete the
object version.
Now my question is: how did the object get overwritten when this CLI command was used to overwrite it? I also tried to re-upload the same file in the console, and that worked too.
Moreover, I uploaded another file and enabled Object Lock with compliance mode, and it also got overwritten. Deletion, however, fails in both cases, as expected.
Did I misunderstand something about how S3 Object Lock works? Any help will be appreciated.

To quote the Object Lock documentation:
Object Lock works only in versioned buckets, and retention periods and
legal holds apply to individual object versions. When you lock an
object version, Amazon S3 stores the lock information in the metadata
for that object version. Placing a retention period or legal hold on
an object protects only the version specified in the request. It
doesn't prevent new versions of the object from being created.
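In other words, the aws s3 cp did not touch the protected version; it created a new, unprotected version on top of it. As a minimal sketch (using the bucket and key from the question; the version ID is a placeholder you would copy from the listing), you can confirm the original locked version is still there and still protected:

# List every version of the key - the locked original should still be present
aws s3api list-object-versions --bucket my-buck --prefix 112133.jpg

# Check the legal hold on the original (non-latest) version
aws s3api get-object-legal-hold --bucket my-buck --key 112133.jpg --version-id VERSION_ID

# Trying to delete that specific version is rejected while the legal hold is on
aws s3api delete-object --bucket my-buck --key 112133.jpg --version-id VERSION_ID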

Related

S3 object lock behaviour after unlocked

From the Object Lock docs:
When you lock an object version, Amazon S3 stores the lock information in the metadata for that object version. Placing a retention period or legal hold on an object protects only the version specified in the request. It doesn't prevent new versions of the object from being created. If you put an object into a bucket that has the same key name as an existing, protected object, Amazon S3 creates a new version of that object, stores it in the bucket as requested, and reports the request as completed successfully. The existing, protected version of the object remains locked according to its retention configuration.
Assuming my bucket does not apply retention period by default, and I have a newly created S3 object with a legal hold. I overwrite it with another file twice. Will only the original version be protected, and all subsequent uploads be squashed into one version?
After I disable the legal hold,
1. Are my versions still maintained?
2. Will all further uploads overwrite the latest version?
3. If I delete the object, are all versions deleted? Or only the latest version?
After some testing,
1. Since versioning is already on, versions are still maintained.
2. Subsequent uploads will still have their own versions.
3. Only the latest version, unless you specify a version id.
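For reference, a rough CLI equivalent of that test (bucket name, key and version IDs below are placeholders), assuming versioning and Object Lock are enabled on the bucket:

# Turn off the legal hold on the protected version
aws s3api put-object-legal-hold --bucket my-bucket --key report.pdf \
    --version-id ORIGINAL_VERSION_ID --legal-hold Status=OFF

# A plain delete does not remove any version; it only adds a delete marker on top
aws s3 rm s3://my-bucket/report.pdf

# Deleting a specific version permanently removes just that version
aws s3api delete-object --bucket my-bucket --key report.pdf --version-id SOME_VERSION_ID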

AWS S3 Storage's Object Lock settings not working

I have created an AWS S3 bucket with Object Lock configured in Compliance Mode. When I upload a file to the bucket (and in the file's settings I can see that Object Lock is enabled in compliance mode), I am still able to delete the file. As far as I understand the AWS documentation, not even the root user can delete a file protected by a Compliance Mode Object Lock.
Please help if I have misunderstood something.
Important
Object locks apply to individual object versions only.
https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock-overview.html
Take a look at How Do I See the Versions of an S3 Object? and switch your console view to "show" object versions. You should find that you didn't actually delete the locked object version.
What you did when you "deleted" the object was create a delete marker.
A delete marker is a placeholder (marker) for a versioned object that was named in a simple DELETE request. Because the object was in a versioning-enabled bucket, the object was not deleted. The delete marker, however, makes Amazon S3 behave as if it had been deleted.
https://docs.aws.amazon.com/AmazonS3/latest/dev/DeleteMarker.html
With the console in the "hide" versions mode, delete requests are "simple DELETE requests" as mentioned above.
With the console in the "show" versions mode, delete operations you attempt are, instead, on specific versions of the object, and you should find that you are unable to delete any versions with object locks.
You'll also find that you can apparently overwrite an object with a new upload, but again you can't actually do that, because uploading an object with the same key in a versioned bucket (and enabling versioning is mandatory for object lock to work) doesn't overwrite the object -- it just creates a newer version of the object, leaving older versions intact.
When the top (newest, current) version of an object is a delete marker, the object disappears from the console and isn't included in ListObjects requests sent to the bucket via the API, but does appear in ListObjectVersions API requests. The "show/hide" setting is only applicable to your personal console view, it doesn't change actual bucket behavior.
The timestamps on object versions can't be altered, so locking an object version not only prevents deletion of the object contents, it also preserves a record of when that object was originally created. "Overwriting" an object creates a new version with a new timestamp, and the timestamps on the versions prove what content existed in the bucket at any given point in time.
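If you want to see the same behaviour from the CLI instead of the console, here is a small sketch (bucket, key and version ID are placeholders):

# A "simple DELETE" (no version ID) only adds a delete marker
aws s3api delete-object --bucket my-bucket --key example.jpg

# All versions, including the locked one and the new delete marker, are still listed
aws s3api list-object-versions --bucket my-bucket --prefix example.jpg

# Deleting the locked version itself is denied while the lock is in effect
aws s3api delete-object --bucket my-bucket --key example.jpg --version-id LOCKED_VERSION_ID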

Google cloud object persists after deletion

I'm using Chrome (vs Cloud SDK / command line) to repeatedly replace a file in a bucket. Dragging / dropping a file to overwrite the existing one, and / or deleting it first and putting it back (changed).
At a certain point the file stops updating and remains in a persistent state, even if I literally rm -r its parent folder.
i.e., I could have /bucket/css/file.css and rm -r /bucket/css and the file will still be available to the public.
From your follow-up, it seems that your bucket has the "Object Versioning" option enabled.
When Object Versioning is enabled for a bucket, Cloud Storage creates an archived version of an object each time the live version of the object is overwritten or deleted.
To verify that “Object Versioning” is enabled on your bucket you can use the following command:
gsutil versioning get gs://[BUCKET_NAME]
The response looks like the following if Object Versioning is enabled:
gs://[BUCKET_NAME]: Enabled
Note that, according to the official documentation, there is no limit to the number of older versions of an object that you will create if you continue to upload to the same object name in a versioning-enabled bucket.
Having said that, I tried to reproduce your case in my own bucket. The steps I followed are:
1. Enable Object Versioning for my bucket.
2. Upload a file to the bucket with the name "example.png", using the GCP Console.
3. Drag and drop another file with the same name ("example.png") but different content.
4. Check the option "Replace existing object".
5. Check whether the file has been updated. It had.
6. Repeated the process 50 times (since you said you had about 40 archived versions of your file), uploading different files one after the other, each time overwriting the previous one. Every time I uploaded a file with different content, a new archived version was created and the live file was updated accordingly, without any problems.
Please review the steps I followed and let me know if there is any additional action from your side.
Since you are able to delete the files via the gsutil command, deletion is working fine and you have all the required permissions. Could you clear your web browser's cookies and try deleting it again? You can also try an incognito window to check whether it works there.
Furthermore, if Object Versioning is on, you can disable it and try deleting the object again. Note that object deletion cannot be undone; once you delete the object it is completely removed.
Additionally, a good practice suggested along with Object Versioning is to create an Object Lifecycle rule for the bucket that deletes all objects that have been stored for more than a specific amount of time. You can use this as a workaround for deleting either live or archived versions of your object (if Object Versioning is actually enabled); the Object Lifecycle documentation describes how to set it up.
Generally, you can review the Deleting data best practices.
Note that, according to the Cloud Storage Object Limits, a single object can only be updated or overwritten up to once per second.
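If Object Versioning does turn out to be enabled, you can also inspect and remove the archived generations from the command line. A minimal sketch, using the path from your example (the generation number after the '#' is a placeholder taken from the listing):

# List all generations (archived versions) of the objects under the folder
gsutil ls -a gs://bucket/css/

# Remove a single archived generation
gsutil rm gs://bucket/css/file.css#1234567890123456

# Or remove the live object together with all of its archived generations
gsutil rm -a gs://bucket/css/file.css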
I used gsutil to delete it and it worked... temporarily. It seems there were around 40 cached versions of the file with hash-tag ids.
At some point it stops updating / deleting the file. :(
gsutil rm -r gs://bucket/path/to/folder/

How to rollback to previous version in Amazon S3 bucket?

I upload folders/files by:
aws s3 cp files s3://my_bucket/
aws s3 cp folder s3://my_bucket/ --recursive
Is there a way to return/rollback to previous version?
Like git revert or something similar?
Here is the test file that I uploaded 4 times.
How do I get to a previous version (make it the "Latest version")?
For example, how do I make the "Jan 17, 2018 12:48:13" or the "Jan 17, 2018 12:24:30" version become the "Latest version", not in the GUI but using the command line?
Here is how to get that done if you are using the CLI:
https://docs.aws.amazon.com/cli/latest/reference/s3api/get-object.html
Get the object, specifying the version you want.
Then perform a put-object with the downloaded file.
https://docs.aws.amazon.com/cli/latest/reference/s3api/put-object.html
Your old S3 object will now be the latest version.
An S3 object is immutable; you can only put and delete it. A rename is a GET and a PUT of the same object under a different name.
Hope it helps.
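A minimal sketch of that get-then-put approach (bucket from the question; the key, file name and version ID are placeholders):

# Download the specific version you want to restore
aws s3api get-object --bucket my_bucket --key test.txt --version-id OLD_VERSION_ID test.txt

# Upload it again; the re-uploaded copy becomes the new latest version
aws s3api put-object --bucket my_bucket --key test.txt --body test.txt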
No. However, to protect against this in the future, you can enable versioning on your bucket and even configure the bucket to prevent automatic overwrites and deletes.
To enable versioning on your bucket, visit the Properties tab of the bucket and turn it on. After you have done so, copies or versions of each item within the bucket will contain version metadata and you will be able to retrieve older versions of the objects you have uploaded.
Once you have enabled versioning, you will not be able to turn it off.
EDIT (Updating my answer for your updated question):
You can't version your objects in this fashion. You are giving each object a unique Key, so S3 is treating it as a new object. You will need to use the same Key for each object PUT to use versioning correctly. The only way to get this to work otherwise would be to GET all of the objects from the bucket and find the most recent date in the Key programmatically.
EDIT 2:
https://docs.aws.amazon.com/AmazonS3/latest/dev/RestoringPreviousVersions.html
To restore previous versions you can:
One of the value propositions of versioning is the ability to retrieve previous versions of an object. There are two approaches to doing so:
Copy a previous version of the object into the same bucket. The copied object becomes the current version of that object and all object versions are preserved.
Permanently delete the current version of the object. When you delete the current object version, you, in effect, turn the previous version into the current version of that object.
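For the first approach, a single copy-object call is enough. A sketch with placeholder key and version ID (the bucket name is the one from the question):

# Copy the old version onto itself; the copy becomes the new latest version
aws s3api copy-object --bucket my_bucket --key test.txt \
    --copy-source "my_bucket/test.txt?versionId=OLD_VERSION_ID"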
I wasn't able to get the answer I was looking for to this question. I figured it out myself in the AWS S3 console and would like to share it here.
So, the quickest way is to simply navigate to:
--AWS Console -> S3 console -> the bucket -> the S3 object
You will see the following:
At this point you can simply navigate to all your object versions by clicking on "Versions" and pick (download or move) whichever version of the object you are interested in.
S3 allows you to enable versioning for your bucket. If you have versioning on, you should be able to find the previous versions. If not, you are out of luck.
See the following page for more information: https://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html

If an updated object in S3 serves as a lambda trigger, is there an inherent race condition?

If I update an object in an S3 Bucket, and trigger on that S3 PUT event as my Lambda trigger, is there a chance that the Lambda could operate on the older version of that object given S3’s eventual consistency model?
I’m having a devil of a time parsing out an authoritative answer either way...
Yes, there is a possibility that a blind GET of an object could fetch a former version.
There are at least two solutions that come to mind.
Weak: the notification event data contains the etag of the newly-uploaded object. If the object you fetch doesn't have this same etag in its response headers, then you know it isn't the intended object.
Strong: enable versioning on the bucket. The event data then contains the object versionId. When you download the object from S3, specify this exact version in the request. The consistency model is not as well documented when you overwrite an object and then download it with a specific version-id, so it is possible that this might result in an occasional 404 -- in which case, you almost certainly just spared yourself from fetching the old object -- but you can at least be confident that S3 will never give you a version other than the one explicitly specified.
If you weren't already using versioning on the bucket, you'll want to consider whether to keep old versions around, or whether to create a lifecycle policy to purge them... but one brilliantly-engineered feature about versioning is that the parts of your code that were written without awareness of versioning should still function correctly with versioning enabled -- if you send non-versioning-aware requests to S3, it still does exactly the right thing... for example, if you delete an object without specifying a version-id and later try to GET the object without specifying a version-id, S3 will correctly respond with a 404, even though the "deleted" version is actually still in the bucket.
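The CLI equivalent of the strong approach, assuming versioning is enabled and the version ID comes from the event record (bucket, key and version ID below are placeholders):

# Fetch exactly the version named in the S3 event, never an older one
aws s3api get-object --bucket my-bucket --key uploaded/object.jpg \
    --version-id VERSION_ID_FROM_EVENT /tmp/object.jpg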
How does the file get there in the first place? I'm asking because, if you could reverse the order, it would solve your issue: you would put the file into S3 via a Lambda that, before overwriting the file, can first get the existing version from the bucket and do whatever you need with it.