Folder name with date on GCP

I want to create a folder in a GCP bucket with the date as a suffix.
I am trying this:
gsutil mkdir gs://bucket_name/raw/data_"$(date +"%m-%d-%y")"
I also tried this:
dt="$(date +"%m-%d-%y")"
mkdir data_$dt
gsutil cp -r data_$dt gs://bucket_name/raw/
But with this I am getting the error:
CommandException: No URLs matched
Is there any other way?

Folders don't exist in Cloud Storage. The folders shown in the console are simply a human-friendly representation.
All the blobs are stored at the root of the bucket. An object's name contains the path (what you call folders) plus the effective file name. Thus, if you add a file with a path, you see directories; if you remove it, the directories disappear.
Because of this, you can't filter on a file pattern, only on a path prefix.
So, if you want to do this, the solution is to create a placeholder file:
dt="$(date +"%m-%d-%y")"
mkdir data_$dt
touch data_$dt/placeholder  # empty file so the "folder" contains at least one object
gsutil cp -r data_$dt gs://bucket_name/raw/
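If you don't want to create a local directory at all, gsutil can also stream an object from stdin, so the placeholder can be written directly. A minimal sketch, assuming the same bucket and prefix as above (the placeholder name is arbitrary):
dt="$(date +"%m-%d-%y")"
# stream an empty object from stdin; the "folder" appears as soon as the object exists
: | gsutil cp - "gs://bucket_name/raw/data_$dt/placeholder"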

Related

Find a specific filename in a specific subdirectory with gsutil ls

I have a Google Cloud Platform Storage bucket with a top-level folder. I want to list files with a specific extension that could be located in any sub-directory within the top-level folder. How can I do that?
Basically I am having trouble using the glob pattern twice:
gsutil ls gs://bucket/top-level-folder/*/**<sub-directory>/**/*.<extension>
The way to do it is:
gsutil ls gs://bucket/top-level-folder/**<sub-directory>**.<extension>
This will list all the files under top-level-folder whose path contains <sub-directory> and which end in the desired extension.
EDIT:
I changed the code to include the sub-directory in the gsutil command.
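For example, a hypothetical invocation with concrete placeholder names (my-bucket, a logs sub-directory, and .csv files are all assumptions):
# list every .csv object under top-level-folder whose path contains "logs"
gsutil ls gs://my-bucket/top-level-folder/**logs**.csv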

Exclude a certain file or directory while copying from a Google Cloud Storage

I wish to copy all files except a certain directory or directories (or files) from my GCS bucket to my local directory. Is there any way I can do this?
For example:
My GCS bucket named so-bucket has three folders (dir1, dir2, dir3) and two files (file1 and file2). I want to copy all the files and directories except dir3 from the bucket to my local directory.
Usually I would do gsutil -m cp -r gs://so-bucket/* . and then delete the dir3 folder.
You can use the gsutil rsync command with the -x option to exclude some objects. Something like:
gsutil -m rsync -r -x '^dir3/*' gs://so-bucket .
should retrieve all objects in the bucket, except objects whose names begin with dir3 (i.e. files not located in the dir3 directory in your example).
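Note that -x takes a Python regular expression matched against object names, so '^dir3/*' can also exclude a top-level object such as a file named dir3.txt. If that matters, a slightly stricter pattern (same bucket as in the question) would exclude only objects under the dir3/ prefix:
# exclude only objects under the dir3/ prefix
gsutil -m rsync -r -x '^dir3/' gs://so-bucket .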

Google GSutil create folder

How can you create a new folder inside a bucket in Google Cloud Storage using the gsutil command?
I tried the same command used for creating a bucket, but I still got an error:
gsutil mb -l us-east1 gs://my-awesome-bucket/new_folder/
Thanks!
The concept of a directory is abstract in Google Cloud Storage. From the docs (How Subdirectories Work):
gsutil provides the illusion of a hierarchical file tree atop the "flat" name space supported by the Google Cloud Storage service. To the service, the object gs://your-bucket/abc/def.txt is just an object that happens to have "/" characters in its name. There is no "abc" directory; just a single object with the given name.
So you cannot "create" a directory like in a traditional File System.
If you're clear about what folders and objects already exist in the bucket, then you can create a new 'folder' with gsutil by copying an object into the folder.
>mkdir test
>touch test/file1
>gsutil cp -r test gs://my-bucket
Copying file://test\file1 [Content-Type=application/octet-stream]...
/ [1 files][ 0.0 B/ 0.0 B]
Operation completed over 1 objects.
>gsutil ls gs://my-bucket
gs://my-bucket/test/
>gsutil ls gs://my-bucket/test
gs://my-bucket/test/file1
It won't work if the local directory is empty.
More simply:
>touch file2
>gsutil cp file2 gs://my-bucket/new-folder/
Copying file://file2 [Content-Type=application/octet-stream]...
>gsutil ls gs://my-bucket/new-folder
gs://my-bucket/new-folder/file2
Be aware of the potential for Surprising Destination Subdirectory Naming, e.g. if the target directory already exists as an object. For an automated process, a more robust approach is to use gsutil rsync, as sketched below.
I don't know if it's possible to create an empty folder with gsutil. For that, use the console's Create Folder button.
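A minimal sketch of the rsync alternative mentioned above, reusing the local test directory and my-bucket from the example (the destination prefix is spelled out explicitly, so there is no destination-naming ambiguity):
# mirror the local "test" directory to an explicit destination prefix
gsutil -m rsync -r test gs://my-bucket/test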
You cannot create folders with gsutil, because gsutil does not support it (see the workaround below).
However, it is supported via:
the UI in the browser
writing your own GCS client (we have written our own custom client which can create folders)
So even though GCS has a flat namespace, as the other answer correctly points out, it is still possible to create single folders as individual objects. Unfortunately, gsutil does not expose this.
(Ugly) workaround with gsutil: add a dummy file to a folder and upload that dummy file; but the folder will be gone once you delete the file, unless other files in that folder are present.
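What the console's Create Folder button (and a custom client) actually creates is a zero-byte object whose name ends in "/". A sketch of doing the same through the JSON API with curl; the bucket name my-bucket is hypothetical, and this assumes you are already authenticated with gcloud:
# create a zero-byte "folder" placeholder object named new-folder/
curl -X POST --data-binary "" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/octet-stream" \
  "https://storage.googleapis.com/upload/storage/v1/b/my-bucket/o?uploadType=media&name=new-folder/"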
Copied from the Google Cloud help:
Copy the object to a folder in the bucket
Use the gsutil cp command to create a folder and copy the image into it:
gsutil cp gs://my-awesome-bucket/kitten.png gs://my-awesome-bucket/just-a-folder/kitten3.png
This works.
You cannot create a folder with gsutil on GCS.
But you can copy an existing folder with gsutil to GCS.
To do so, the folder must not be empty, and the -r flag is needed, as shown below; otherwise you will get an error if the folder is empty or if you forget the -r flag:
gsutil cp -r <non-empty-folder> gs://your-bucket  # -r is needed for folders
You cannot create an empty folder with mb; mb creates buckets, not folders.

Rename a folder in GCS using gsutil

I can't rename an existing folder in GCS. How do I do this?
As per the documentation, this should be:
gsutil mv gs://my_bucket/olddir gs://my_bucket/newdir
However, what happens is that olddir is placed under newdir, i.e. the directory structure looks like this (after the call to gsutil mv):
my_bucket
  newdir
    olddir
instead of (what I would expect):
my_bucket
  newdir
I've tried all four combinations of putting trailing slashes or not, but none of them worked.
This is a confirmed bug; see https://issuetracker.google.com/issues/112817360
It actually only happens when the name of newdir is a substring of olddir. So the gsutil call from the question actually works, but the following one would not:
gsutil mv gs://my-organization-empty-bucket/dir_old gs://my-organization-empty-bucket/dir
I reproduced your case with a bucket containing a folder named olddir, whose content I wanted to move to a newdir folder.
The following command:
gsutil mv gs://<bucketname>/olddir gs://<bucketname>/newdir
moved the whole content of the folder to the newly created newdir folder.
The olddir and newdir folders were then at the same level, in the bucket root.
After that I just had to remove the folder called olddir.
Objects in a bucket cannot be renamed; a rename is really a copy followed by a delete.
The gsutil mv command does not remove the previous folder object the way the mv command would in a Unix CLI.
I suspect that if you have tried moving folders several times with the "/" characters placed differently, the structure and hierarchy of the folders will have changed after the initial command.
Please try again from the beginning.
Bear in mind that once you have a subfolder inside a folder, the objects will have to be moved one by one using their full paths.
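Until the bug is fixed, an explicit copy-then-delete sequence avoids the substring issue entirely. A sketch using the bucket and folder names from the question:
# copy everything under olddir into newdir, then delete the originals
gsutil -m cp -r "gs://my_bucket/olddir/*" gs://my_bucket/newdir/
gsutil -m rm -r gs://my_bucket/olddir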

How to download all bucket files using gsutil with django-gae app

I want to download all bucket files using gsutil in my django-nonrel GAE app.
gsutil -m cp -R gs://"BUCKET NAME" .
Replace "BUCKET NAME" with the name of your bucket(no quotes) and don't forget the period at the end. If you want to specify a folder replace the period with the destination folder.
-m to perform a parallel (multi-threaded/multi-processing) copy
-R to copy an entire directory tree
More details can be found in the gsutil cp documentation.
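For instance, a hypothetical run (the bucket name my-bucket and the destination folder are assumptions) that downloads everything into a local backup directory:
# download the whole bucket into the local "backup" directory
mkdir -p backup
gsutil -m cp -R gs://my-bucket backup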