Rename a folder in GCS using gsutil - google-cloud-platform

I can't rename an existing folder in GCS. How do I do this?
As per the documentation, this should be:
gsutil mv gs://my_bucket/olddir gs://my_bucket/newdir
However, what happens is that olddir is placed under newdir, i.e., the directory structure looks like this after the call to gsutil mv:
my_bucket
    newdir
        olddir
instead of what I would expect:
my_bucket
    newdir
I've tried all four combinations of putting trailing slashes or not, but none of them worked.

This is a confirmed bug in GCS, see https://issuetracker.google.com/issues/112817360
It actually only happens when the name of newdir is a substring of the name of olddir. So the gsutil call from the question actually works, but the following one would not:
gsutil mv gs://my-organization-empty-bucket/dir_old gs://my-organization-empty-bucket/dir

I reproduced your case by having a bucket with a folder named olddir of which I want to move the content to newdir folder.
The following command:
gsutil mv gs://<bucketname>/olddir gs://<bucketname>/newdir
moved the whole content of folder to the newly created newdir folder.
Olddir and newdir folders were then at the same level, in the bucket root.
After that I just had to remove the folder called olddir.
Objects in a bucket cannot be renamed in place; gsutil mv copies each object to its new name and then deletes the original.
The gsutil mv command does not remove the previous folder object like the mv command would in a Unix shell.
I guess that if you have tried moving folders several times by using "/" characters placed differently, the structure and hierarchy of the folders will have changed after issuing the initial command.
Please try again from the beginning.
Bear in mind that once you have a subfolder inside a folder, objects will have to be moved one by one using the full path.

Related

Folder name with date on GCP

I want to create a folder in GCP bucket with date as suffix:
I am trying this
gsutil mkdir gs://bucket_name/raw/data_"$(date +"%m-%d-%y")"
I also tried this:
dt="$(date +"%m-%d-%y")"
mkdir data_$dt
gsutil cp -r data_$dt gs://bucket_name/raw/
But this gives an error:
CommandException: No URLs matched
Is there any other way?
Folders don't exist in Cloud Storage. The folder representation in the console is simply a human-friendly view.
All the blobs are stored at the root of the bucket. The object name contains the path (what you call a folder) and the actual file name. Thus, if you add a file with a path, you see directories. If you remove it, all the directories disappear.
Because of this, you can't filter on a file pattern, only on the path prefix.
So, if you want to do this, the solution is to create a placeholder file:
dt="$(date +"%m-%d-%y")"
mkdir data_$dt
touch data_$dt/placeholder
gsutil cp -r data_$dt gs://bucket_name/raw/
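A variation on the placeholder approach, as a minimal sketch: because the "folder" is only a prefix in the object name, the placeholder can be copied straight to the dated path without creating a local directory first. The helper names dated_prefix and upload_placeholder are hypothetical; bucket_name and the raw/ layout are taken from the question. The upload itself assumes gsutil is installed and authenticated, so it is only defined here, not run.

```shell
#!/bin/sh
# Hypothetical helper: build the date-suffixed "folder" prefix.
dated_prefix() {
  # $1 = bucket name, $2 = date string such as 01-31-24
  printf 'gs://%s/raw/data_%s/\n' "$1" "$2"
}

# Hypothetical helper: upload an empty placeholder object under the prefix.
# Requires gsutil and valid credentials.
upload_placeholder() {
  # $1 = destination prefix ending in "/"
  : > placeholder
  gsutil cp placeholder "${1}placeholder"
}

dt="$(date +%m-%d-%y)"
dest="$(dated_prefix bucket_name "$dt")"
echo "Target prefix: $dest"
# To perform the upload: upload_placeholder "$dest"
```

Deleting the placeholder later makes the "folder" disappear again unless other objects share the prefix.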

How to create an empty folder in Google Cloud Storage (bucket) using the gsutil command?

How can we create a folder using the gsutil command? I am using BashOperator in Airflow, where I need to use the gsutil Bash command. The bucket is already created; I want to create a folder inside it.
I already tried the command below, but it's not working for me.
$ gsutil cp <new_folder> gs://<bucketname>/
I am getting this error: CommandException: No URLs matched: new_folder
Google Cloud Storage does not work like a regular file system as in Windows/Linux. It appears to have folders, but in the background it behaves as if it does not. It only lets us create "folders" so we can organize objects better, for our convenience.
If you want to save data in specific folders with gsutil, try this:
gsutil cp [filetocopy] gs://your-bucket/folderyouwant/your-file
It will store the item in a "folder".
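See the gsutil cp documentation for more information.
This is the logic behind Google Cloud Storage "folders", quoted from the gsutil documentation for a command such as gsutil cp your-file gs://your-bucket/abc: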
gsutil will make a bucket listing request for the named bucket, using
delimiter="/" and prefix="abc". It will then examine the bucket
listing results and determine whether there are objects in the bucket
whose path starts with gs://your-bucket/abc/, to determine whether to
treat the target as an object name or a directory name. In turn this
impacts the name of the object you create: If the above check
indicates there is an "abc" directory you will end up with the object
gs://your-bucket/abc/your-file; otherwise you will end up with the
object gs://your-bucket/abc.
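The check described in that passage can be modelled locally, as a rough sketch. The function resolve_dest below is hypothetical and only imitates the decision (it is not gsutil's actual code): it reads existing object names on stdin and reports which object name a copy to the target abc would end up with.

```shell
# Hypothetical model of the destination check: existing object names arrive
# on stdin, one per line.
resolve_dest() {
  # $1 = target name (e.g. abc), $2 = source file name
  target=$1 file=$2 found=no
  while IFS= read -r name; do
    case $name in
      "$target"/*) found=yes ;;   # some object already lives under target/
    esac
  done
  if [ "$found" = yes ]; then
    printf '%s/%s\n' "$target" "$file"   # target is treated as a directory
  else
    printf '%s\n' "$target"              # target is treated as an object name
  fi
}

printf '%s\n' 'abc/def.txt' 'other.txt' | resolve_dest abc your-file   # abc/your-file
printf '%s\n' 'other.txt' | resolve_dest abc your-file                 # abc
```

Ending the destination with a slash (gs://your-bucket/abc/) avoids the ambiguity, because it states explicitly that abc is meant as a directory.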
Here you have more interesting information about this if you want.
Apparently the ability to create an empty folder using gsutil has been requested a few times but not yet implemented. There appear to be some workarounds that use the API and can then be scripted. The GitHub issue for creating empty folders through scripting can be found here:
https://github.com/GoogleCloudPlatform/gsutil/issues/388
As far as I have researched and tried, you cannot create or copy an empty folder to GCS with gsutil. Yes, it's somewhat inconvenient.
A folder must not be empty to be created or copied to GCS, and the -r flag is required, as shown below. Otherwise you will get an error:
gsutil cp -r <non-empty-folder> gs://your-bucket
# "-r" is needed for a folder

Move files within source folder to destination folder without deleting source folder

I am trying to move subfolders from one directory of an S3 bucket to another directory in the same bucket. After the files within the subfolders are moved, the main directory gets deleted, which I need to avoid.
aws s3 mv s3://Bucket-Name/Input-List/$i/ s3://Bucket-Name/Input-List-Archive/$i/ --recursive
COLLECTION_LIST=(A B C D E F)
# Iterate over every element of the array
for i in "${COLLECTION_LIST[@]}"
do
  if [ "$i" = "A" ] || [ "$i" = "B" ]
  then
    aws s3 mv "s3://Bucket-Name/Input-List/$i/" "s3://Bucket-Name/Input-List-Archive/$i/" --recursive
  else
    aws s3 mv "s3://Bucket-Name/Input-List/Others/$i/" "s3://Bucket-Name/Input-List-Archive/Others/$i/" --recursive
  fi
done
Here all files within Input-List must be moved to Input-List-Archive without Input-List directory being deleted.
How about writing a script to copy the files recursively from sub folders and deleting those files from sub folder instead of using mv command?
Firstly, please note that directories/folders do not actually exist in Amazon S3.
For example, I could run this command:
aws s3 cp foo.txt s3://my-bucket/folder1/folder2/foo.txt
This will work successfully, even if folder1 and folder2 do not exist.
The Amazon S3 management console will make those folders 'appear', but they do not actually exist.
If I then ran:
aws s3 rm s3://my-bucket/folder1/folder2/foo.txt
then the object would be deleted and the folders would 'disappear' (because they never actually existed).
Sometimes, however, people want a folder to appear. When a folder is created in the management console, a zero-length object is created with the Key (filename) set to the name of the folder. This will force an empty 'folder' to appear, but it is not actually a folder.
When listing objects in S3, API calls can return a common prefix which is similar in concept to a folder, but it is really just the "path portion" of a filename.
It is also worth mentioning that there is no "move" command in Amazon S3. Instead, when using the aws s3 mv command, the AWS CLI copies the object to a new object and then deletes the original object. This makes the object look like it was moved, but it actually was copied and deleted.
So, your options are:
Don't worry about folders. Just pretend they exist. They do not serve any purpose. OR
Create a new folder after the move. OR
Write your own program to Copy & Delete the objects without deleting the folder.
In fact, it is quite possible that the folder never existed in the first place (that is, there was no zero-length file with a Key matching the name of the folder), so it was never actually deleted. It's just that there was nothing to cause S3 to make the folder 'appear' to be there.
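The "create a new folder after the move" option can be sketched as follows. This is a hedged example, not a tested recipe: it assumes the AWS CLI is installed and configured, and it uses aws s3api put-object to write a zero-length object whose key ends in "/", which is what makes an empty 'folder' appear. The helper names marker_key and recreate_marker are hypothetical; the bucket and folder names come from the question.

```shell
# Hypothetical helper: turn a folder name into the key of its zero-length
# marker object (exactly one trailing slash).
marker_key() {
  printf '%s/\n' "${1%/}"
}

# Hypothetical helper: recreate the marker after the move.
# Requires the AWS CLI and valid credentials.
recreate_marker() {
  # $1 = bucket, $2 = folder path
  aws s3api put-object --bucket "$1" --key "$(marker_key "$2")"
}

marker_key "Input-List"    # prints Input-List/
marker_key "Input-List/"   # prints Input-List/
# After the aws s3 mv loop: recreate_marker Bucket-Name Input-List
```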

Google GSutil create folder

How can I create a new folder inside a bucket in Google Cloud Storage using the gsutil command?
I tried the same command that is used to create a bucket, but got an error:
gsutil mb -l us-east1 gs://my-awesome-bucket/new_folder/
Thanks!
The concept of a directory is abstract in Google Cloud Storage. From the docs (How Subdirectories Work):
gsutil provides the illusion of a hierarchical file tree atop the "flat" name space supported by the Google Cloud Storage service. To the service, the object gs://your-bucket/abc/def.txt is just an object that happens to have "/" characters in its name. There is no "abc" directory; just a single object with the given name.
So you cannot "create" a directory like in a traditional File System.
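To make the flat namespace concrete, here is a small local simulation. It does not call Cloud Storage; the sample object names and the function list_prefixes are made up for illustration. It derives the "subdirectories" under a prefix purely from the "/" characters in object names, which is roughly what a listing with a "/" delimiter reports.

```shell
# Hypothetical sample of object names in a flat namespace.
objects='abc/def.txt
abc/ghi/jkl.txt
top.txt'

# Print the "subdirectories" directly under a prefix, the way a listing with
# delimiter "/" would group them. Object names are read from stdin.
list_prefixes() {
  # $1 = prefix, possibly empty
  awk -v p="$1" '
    substr($0, 1, length(p)) == p {
      rest = substr($0, length(p) + 1)
      i = index(rest, "/")
      if (i > 0) print p substr(rest, 1, i)
    }' | sort -u
}

printf '%s\n' "$objects" | list_prefixes ""       # abc/
printf '%s\n' "$objects" | list_prefixes "abc/"   # abc/ghi/
```

Removing abc/ghi/jkl.txt from the sample makes abc/ghi/ vanish from the second listing, even though nothing called "abc/ghi/" was ever deleted.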
If you're clear about what folders and objects already exist in the bucket, then you can create a new 'folder' with gsutil by copying an object into the folder.
>mkdir test
>touch test/file1
>gsutil cp -r test gs://my-bucket
Copying file://test\file1 [Content-Type=application/octet-stream]...
/ [1 files][ 0.0 B/ 0.0 B]
Operation completed over 1 objects.
>gsutil ls gs://my-bucket
gs://my-bucket/test/
>gsutil ls gs://my-bucket/test
gs://my-bucket/test/file1
It won't work if the local directory is empty.
More simply:
>touch file2
>gsutil cp file2 gs://my-bucket/new-folder/
Copying file://test\file2 [Content- ...
>gsutil ls gs://my-bucket/new-folder
gs://my-bucket/new-folder/file2
Be aware of the potential for Surprising Destination Subdirectory Naming, e.g. if the target directory name already exists as an object. For an automated process, a more robust approach would be to use rsync.
I don't know if it's possible to create an empty folder with gsutil. For that, use the console's Create Folder button.
You cannot create folders with gsutil, as gsutil does not support it (see the workaround below).
However, it is supported via:
UI in browser
write your own GCS client (we have written our own custom client which can create folders)
So even if Google has a flat name space structure as the other answer correctly points out, it still has the possibility to create single folders as individual objects. Unfortunately gsutil does not expose this.
(Ugly) workaround with gsutil: Add a dummy file into a folder and upload this dummy file - but the folder will be gone once you delete this file, unless other files in that folder are present.
Copied from Google cloud help:
Copy the object to a folder in the bucket
Use the gsutil cp command to create a folder and copy the image into it:
gsutil cp gs://my-awesome-bucket/kitten.png gs://my-awesome-bucket/just-a-folder/kitten3.png
This works.
You cannot create a folder with gsutil on GCS.
But you can copy an existing folder with gsutil to GCS.
To copy an existing folder with gsutil to GCS, the folder must not be empty and the -r flag is required, as shown below. Otherwise you will get an error:
gsutil cp -r <non-empty-folder> gs://your-bucket
# "-r" is needed for a folder
You cannot create an empty folder with mb.

AWS S3: How to delete all contents of a directory in a bucket but not the directory itself?

I have an AWS S3 bucket entitled static.mysite.com
This bucket contains a directory called html
I want to use the AWS Command Line Interface to remove all contents of the html directory, but not the directory itself. How can I do it?
This command deletes the directory too:
aws s3 rm s3://static.mysite.com/html/ --recursive
I don't see the answer to this question in the manual entry for AWS S3 rm.
Old question, but I didn't see the answer here. If you have a use case to keep the 'folder' prefix, but delete all the files, you can use --exclude with an empty match string. I found the --exclude "." and --exclude ".." options do not prevent the folder from being deleted. Use this:
aws s3 rm s3://static.mysite.com/html/ --recursive --exclude ""
I just want to confirm how the folders were created...
If you created the "subA" folder manually and then deleted the "suba1" folder, you should find that the "subA" folder remains. When you create a folder manually, you are actually creating a folder "object", which is similar to any other file/object that you upload to S3.
However, if a file was uploaded directly to a location in S3 (when the "subA" and "suba1" folders don't exist yet), you'll find that the "subA" and "suba1" folders are created automatically. You can do this using something like the AWS CLI tool, e.g.:
aws s3 cp file1.txt s3://bucket/subA/suba1/file1.txt
If you now delete file1.txt, there will no longer be any objects within the "subA" folder and you'll find that the "subA" and "suba1" folders no longer exist.
If another file (file2.txt) was uploaded to the path "bucket/subA/file2.txt", and you deleted file1.txt (from the previous example) you'll find that the "subA" folder remains and the "suba1" folder disappears.
https://forums.aws.amazon.com/thread.jspa?threadID=219733
aws s3 rm s3://static.mysite.com/html/ --recursive --exclude ""
This command worked for me to delete all the files but not the folder.