Google Cloud object persists after deletion - google-cloud-platform

I'm using Chrome (rather than the Cloud SDK / command line) to repeatedly replace a file in a bucket: dragging and dropping a file to overwrite the existing one, and/or deleting it first and putting back a changed copy.
At a certain point the file stops updating and remains in a persistent state, even if I literally rm -r its parent folder.
i.e., I could have /bucket/css/file.css, rm -r /bucket/css, and the file will still be available to the public.

From your second answer, it seems that your bucket has the “Object Versioning” option enabled.
When Object Versioning is enabled for a bucket, Cloud Storage creates an archived version of an object each time the live version of the object is overwritten or deleted.
To verify that “Object Versioning” is enabled on your bucket you can use the following command:
gsutil versioning get gs://[BUCKET_NAME]
The response looks like the following if Object Versioning is enabled:
gs://[BUCKET_NAME]: Enabled
Also note that, according to the official documentation, there is no limit to the number of older versions of an object you will create if you continue to upload to the same object in a versioning-enabled bucket.
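If that is the case, you can also list the archived versions of an object (each one shown with its generation number appended after a “#”, which may correspond to the hash-tag IDs you mentioned) with the -a flag; the bucket and object names below are placeholders:
gsutil ls -a gs://[BUCKET_NAME]/[OBJECT_NAME]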
Having said that, I tried to reproduce your case in my own bucket. The steps I followed are:
1. Enable Object Versioning for my bucket.
2. Upload a file to the bucket with the name “example.png”, using the GCP Console.
3. Drag and drop another file with the same name (“example.png”), but different content.
4. Check the option “Replace existing object”.
5. Check if the file has been updated. It had.
6. Repeat the process 50 times (since you said you had 40 archived versions of your file), uploading different files one after the other, each time overriding the previous one. Each time I uploaded a file with different content, a new archived version was created, and the live file updated accordingly without any problems.
Please review the steps I followed and let me know if there is any additional action from your side.
Since you are able to delete the files via the gsutil command, deletion is working fine and you have all the required permissions. Would you be able to clear your web browser's cookies and try deleting it again? You can also try an incognito window to check if it's working.
Furthermore, if Object Versioning is on, you can disable it and try deleting the object again. Note that object deletion cannot be undone; once you delete the object it will be completely removed.
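For reference, a possible sequence would be the following (bucket and object names are placeholders); note that suspending versioning does not remove the archived versions that already exist, while the -a flag on rm deletes every version of the object:
gsutil versioning set off gs://[BUCKET_NAME]
gsutil rm -a gs://[BUCKET_NAME]/[OBJECT_NAME]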
Additionally, a good practice suggested along with Object Versioning is to create an Object Lifecycle rule for the bucket that deletes objects that have been stored for more than a specific amount of time. You can use this as a workaround for deleting either live or archived versions of your object (if Object Versioning is actually enabled); to set it up, you can follow the Object Lifecycle Management documentation.
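As an illustration only, a lifecycle configuration that deletes archived (non-live) versions older than 30 days could look like the following; the file name and the age value are just examples:
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 30, "isLive": false}
    }
  ]
}
gsutil lifecycle set lifecycle.json gs://[BUCKET_NAME]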
Generally, you can review Deleting data best practices here.
Note that, according to the Cloud Storage object limits, a particular object can only be updated or overwritten up to once per second. For more information, check here.

I used gsutil to delete it and it worked... temporarily. It seems there were around 40 cached versions of the file with hash-tag IDs.
At some point it stops updating / deleting the file. :(
gsutil rm -r gs://bucket/path/to/folder/

Related

Objects get overwritten in S3 while governance mode along with legal hold is enabled

I'm an absolute beginner in AWS and have been practising for the past 3 months.
Recently I was working on S3 and playing a bit with S3 Object Lock. I enabled Object Lock for a specific object with governance mode along with a legal hold. When I tried to overwrite the object with the same file using the following CLI command:
aws s3 cp /Users/John/Desktop/112133.jpg s3://my-buck/112133.jpg
Interestingly, it succeeded, and I checked in the console that the new file was uploaded and marked as the latest version. Then I read this in the AWS docs:
Bypassing governance mode doesn't affect an object version's legal
hold status. If an object version has a legal hold enabled, the legal
hold remains in force and prevents requests to overwrite or delete the
object version.
Now my question is: how did it get overwritten if this CLI command is used to overwrite a file? I also tried to re-upload the same file in the console and that worked too.
Moreover, I uploaded another file and enabled Object Lock with compliance mode, and it also got overwritten. Deletion, however, doesn't work in either case, as expected.
Did I misunderstand something about the whole S3 Object Lock thing? Any help will be appreciated.
To quote the Object Lock documentation:
Object Lock works only in versioned buckets, and retention periods and
legal holds apply to individual object versions. When you lock an
object version, Amazon S3 stores the lock information in the metadata
for that object version. Placing a retention period or legal hold on
an object protects only the version specified in the request. It
doesn't prevent new versions of the object from being created.
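You can verify this yourself: listing the versions of the key should show both the original, locked version and the new one created by your upload. For example, with the bucket and key from your question (the output will vary):
aws s3api list-object-versions --bucket my-buck --prefix 112133.jpg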

Move large number of folders and files inside a GCS bucket

I have a bucket on GCP and at the top level of this bucket, I have a bunch of folders.
I want to create a new folder and move all of the other ones into it.
However, I've mounted my bucket with gcsfuse and tried traditional Linux mv commands. This is not allowed, apparently.
Likewise, I have also tried gsutil -m mv gs://mybucket/* gs://mybucket/new_folder/ and have received the command error that wildcards are not allowed in this operation.
What's the best option to get this large number of files moved into a new directory?
Posting this as a Community Wiki answer, based on the comments provided by @JohnHanley.
A few concepts to note for Cloud Storage.
Objects are immutable, which means you cannot rename them. You must copy objects and delete the original to emulate changing the name.
Directories/Folders do not exist. The namespace is flat, all objects are in the root directory. The appearance of folders is just a part of the object name.
Cloud Storage supports internal object copy. Be careful not to use a feature which first downloads the object and then uploads it.
Considering this information, you will need to use a tool such as gsutil to copy and delete the objects, which will let you rename and move the files as you would like.
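As a rough sketch (assuming your bucket is named mybucket and the destination folder new_folder, and that you run it in Cloud Shell), you could loop over the top-level prefixes with gsutil ls and move each one individually, avoiding the wildcard altogether; test it on non-critical data first, because mv copies and then deletes:
for prefix in $(gsutil ls gs://mybucket/); do
  # skip the destination folder itself
  [ "$prefix" = "gs://mybucket/new_folder/" ] && continue
  gsutil -m mv "$prefix" "gs://mybucket/new_folder/$(basename "$prefix")"
done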

AWS S3 Storage's Object Lock settings not working

I have created an AWS S3 bucket with Object Lock settings in compliance mode. When I upload a file to the bucket (and in the file's settings I can see that Object Lock is enabled in compliance mode), I am still able to delete the file. As per the AWS documentation, as far as I understand, not even the root user should be able to delete a file protected by a compliance-mode Object Lock.
Please help me if I have misunderstood something.
Important
Object locks apply to individual object versions only.
https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock-overview.html
Take a look at How Do I See the Versions of an S3 Object? and switch your console view to "show" object versions. You should find that you didn't actually delete the locked object version.
What you did when you "deleted" the object was create a delete marker.
A delete marker is a placeholder (marker) for a versioned object that was named in a simple DELETE request. Because the object was in a versioning-enabled bucket, the object was not deleted. The delete marker, however, makes Amazon S3 behave as if it had been deleted.
https://docs.aws.amazon.com/AmazonS3/latest/dev/DeleteMarker.html
With the console in the "hide" versions mode, delete requests are "simple DELETE requests" as mentioned above.
With the console in the "show" versions mode, delete operations you attempt are, instead, on specific versions of the object, and you should find that you are unable to delete any versions with object locks.
You'll also find that you can apparently overwrite an object with a new upload, but again you can't actually do that, because uploading an object with the same key in a versioned bucket (and enabling versioning is mandatory for object lock to work) doesn't overwrite the object -- it just creates a newer version of the object, leaving older versions intact.
When the top (newest, current) version of an object is a delete marker, the object disappears from the console and isn't included in ListObjects requests sent to the bucket via the API, but does appear in ListObjectVersions API requests. The "show/hide" setting is only applicable to your personal console view, it doesn't change actual bucket behavior.
The timestamps on object versions can't be altered, so locking an object version not only prevents deletion of the object contents, it also preserves a record of when that object was originally created. "Overwriting" an object creates a new version with a new timestamp, and the timestamps on the versions prove what content existed in the bucket at any given point in time.

Copying objects from one bucket directory folder to another bucket folder using transfer

I want to use Google's Transfer service to copy all folders/files in a specific directory in Bucket-1 to the root directory of Bucket-2.
I have tried to use Transfer with the filter option, but it doesn't copy anything across.
Any pointers on getting this to work within transfer or step by step for functions would be really appreciated.
I reproduced your issue and it worked for me using gsutil.
For example:
gsutil cp -r gs://SourceBucketName/example.txt gs://DestinationBucketName
Furthermore, I tried to copy using Transfer option and it also worked. The steps I have done with Transfer option are these:
1 - Create a new Transfer Job.
Panel “Select source”:
2 - Select your source, for example a Google Cloud Storage bucket.
3 - Select the bucket with the data you want to copy.
4 - In the field “Transfer files with these prefixes”, add your data (I used “example.txt”).
Panel “Select destination”:
5 - Select your destination bucket.
Panel “Configure transfer”:
6 - Choose “Run now” if you want to complete the transfer now.
7 - Press “Create”.
For more information about copying from one bucket to another, you can check the official documentation.
So, a few things to consider here:
You have to keep in mind that Google Cloud Storage buckets don’t treat subdirectories the way you would expect. To the bucket it is basically all part of the file name. You can find more information about that in the How Subdirectories Work documentation.
This is also the reason why you cannot transfer a file that is inside a “directory” and expect to see only the file's name appear in the root of your target bucket. To give you an example:
If you have a file at gs://my-bucket/my-bucket-subdirectory/myfile.txt, once you transfer it to your second bucket it will still have the subdirectory in its name, so the result will be: gs://my-second-bucket/my-bucket-subdirectory/myfile.txt
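If all you need is the files themselves at the root of the second bucket, one workaround (outside of Transfer) is a flat wildcard copy with gsutil, which names each copied object after its final path component; the bucket names below are just the ones from the example, and this only matches objects directly under that prefix, not nested ones:
gsutil -m cp gs://my-bucket/my-bucket-subdirectory/* gs://my-second-bucket/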
This is why, if you are interested in automating this process, you should definitely give the Google Cloud Storage Client Libraries a try.
Additionally, you could also use the GCS Client with Google Cloud Functions. However, I would just suggest this if you really need the Event Triggers offered by GCF. If you just want the transfer to run regularly, for example on a cron job, you could still use the GCS Client somewhere other than a Cloud Function.
The Cloud Storage Tutorial might give you a good example of how to handle Storage events.
Also, in your future posts, try to provide as much relevant information as possible. For this post, as an example, it would've been nice to know what file structure you have in your buckets and what output you have been getting. If you can state your use case straight away, it will also prevent other users from suggesting solutions that don't apply to your needs.
Try this in Cloud Shell in the project:
gsutil cp -r gs://bucket1/foldername gs://bucket2

How to rollback to previous version in Amazon S3 bucket?

I upload folders/files by:
aws s3 cp files s3://my_bucket/
aws s3 cp folder s3://my_bucket/ --recursive
Is there a way to return/rollback to previous version?
Like git revert or something similar?
Here is the test file that I uploaded 4 times.
How do I get to a previous version (make it the "Latest version")?
For example, how do I make "Jan 17, 2018 12:48:13" or "Jan 17, 2018 12:24:30"
become the "Latest version", not in the GUI but by using the command line?
Here is how to get that done:
If you are using the CLI:
https://docs.aws.amazon.com/cli/latest/reference/s3api/get-object.html
Get the object with the version you want.
Then perform a put object for the downloaded object.
https://docs.aws.amazon.com/cli/latest/reference/s3api/put-object.html
Your old S3 object will be the latest object now.
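In CLI terms, that would look roughly like this (the bucket name, key and version ID are placeholders; list-object-versions shows the IDs you can pass to get-object):
aws s3api list-object-versions --bucket my_bucket --prefix test.txt
aws s3api get-object --bucket my_bucket --key test.txt --version-id VERSION_ID test.txt
aws s3api put-object --bucket my_bucket --key test.txt --body test.txt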
AWS S3 objects are immutable; you can only put and delete them. A rename is a GET and PUT of the same object with a different name.
Hope it helps.
No. However, to protect against this in the future, you can enable versioning on your bucket and even configure the bucket to prevent automatic overwrites and deletes.
To enable versioning on your bucket, visit the Properties tab in the bucket and turn it on. After you have done so, copies or versions of each item within the bucket will contain version meta data and you will be able to retrieve older versions of the objects you have uploaded.
Once you have enabled versioning, you will not be able to turn it off.
EDIT (Updating my answer for your updated question):
You can't version your objects in this fashion. You are giving each object a unique Key, so S3 treats each one as a new object. You need to use the same Key for each object PUT to use versioning correctly. The only way to get this to work would be to GET all of the objects from the bucket and find the most current date in the Key programmatically.
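As a rough illustration only (assuming a bucket named my_bucket and that the upload time tracks the date in the Key), you could sort the listing by LastModified and take the newest Key:
aws s3api list-objects-v2 --bucket my_bucket --query 'sort_by(Contents, &LastModified)[-1].Key' --output text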
EDIT 2:
https://docs.aws.amazon.com/AmazonS3/latest/dev/RestoringPreviousVersions.html
To restore previous versions you can:
One of the value propositions of versioning is the ability to retrieve
previous versions of an object. There are two approaches to doing so:
Copy a previous version of the object into the same bucket. The copied object becomes the current version of that object and all object versions are preserved.
Permanently delete the current version of the object. When you delete the current object version, you, in effect, turn the previous version into the current version of that object.
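The first approach maps to a single server-side copy. As a sketch (the bucket, key and version ID are placeholders), copying a specific version onto its own key makes that version the latest while preserving all existing versions:
aws s3api copy-object --bucket my_bucket --key test.txt --copy-source "my_bucket/test.txt?versionId=VERSION_ID"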
I wasn't able to get the answer I was looking for to this question, so I figured it out myself in the AWS S3 console and would like to share it here.
So, the quickest way is to simply navigate to:
AWS Console -> S3 console -> the bucket -> the S3 object
You will see the following:
At this point you can simply navigate to all your object versions by clicking "Versions" and pick (download or move) whichever version of the object you are interested in.
S3 allows you to enable versioning for your bucket. If you have versioning on, you should be able to get previous versions back. If not, you are out of luck.
See the following page for more information: https://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html