What does this error message mean in gsutil - google-cloud-platform

Can anyone give advice on debugging why this command isn't working on MacOS Catalina?
~ $ gsutil cp -r gs://elium/photo.video/ /Users/alex -v
CommandException: Destination URL must name a directory, bucket, or bucket
subdirectory for the multiple source form of the cp command.
/Users/alex is definitely an existing folder

You have the token -v at the end of your command, and gsutil thinks that is the destination path. When you provide multiple args to gsutil cp, it thinks all args except the last arg are sources (the objects/files to be copied), and the last arg is the destination (the folder/bucket to copy the files into).
The -v flag should go before all of your non-flag arguments. Try using cp -r -v.

#mhouglum was right!
Also, I'll add that the trailing slash in the source URL broke things silently. (I ran the command but got no output) The final working cmd was:
gsutil cp -r -v gs://elium/photo.video /Users/alex

Related

Download Google Cloud Bucket to Specific Folder on Local Machine

I'm trying to download an entire bucket to my local machine.
I'm aware that the command to do this is:
gsutil -m cp -r \ "gs://bucket_name/folder_name/" \ .
However, I'd like to specify exactly where this gets downloaded on my machine due to storage limitations.
Can anyone share any advice regarding this?
Thanks in advance,
Tommy
You can place your download files where you want to put by adding a destination url value to the last parameter of the gsutil cp command you are using, for example:
gsutil -m cp -r "gs://bucket_name" "D:\destination_folder"

How to download an entire bucket in GCP?

I have a problem downloading entire folder in GCP. How should I download the whole bucket? I run this code in GCP Shell Environment:
gsutil -m cp -R gs://my-uniquename-bucket ./C:\Users\Myname\Desktop\Bucket
and I get an error message: "CommandException: Destination URL must name a directory, bucket, or bucket subdirectory for the multiple source form of the cp command. CommandException: 7 files/objects could not be transferred."
Could someone please point out the mistake in the code line?
To download an entire bucket You must install google cloud SDK
then run this command
gsutil -m cp -R gs://project-bucket-name path/to/local
where path/to/local is your path of local storage of your machine
The error lies within the destination URL as specified by the error message.
I run this code in GCP Shell Environment
Remember that you are running the command from the Cloud Shell and not in a local terminal or Windows Command Line. Thus, it is throwing that error because it cannot find the path you specified. If you inspect the Cloud Shell's file system/structure, it resembles more that of a Unix environment in which you can specify the destination like such instead: ~/bucketfiles/. Even a simple gsutil -m cp -R gs://bucket-name.appspot.com ./ will work since Cloud Shell can identify the ./ directory which is the current directory.
A workaround to this issue is to perform the command on your Windows Command Line. You would have to install Google Cloud SDK beforehand.
Alternatively, this can also be done in Cloud Shell, albeit with an extra step:
Download the bucket objects by running gsutil -m cp -R gs://bucket-name ~/ which will download it into the home directory in Cloud Shell
Transfer the files downloaded in the ~/ (home) directory from Cloud Shell to the local machine either through the User Interface or by running gcloud alpha cloud-shell scp
Your destination path is invalid:
./C:\Users\Myname\Desktop\Bucket
Change to:
/Users/Myname/Desktop/Bucket
C: is a reserved device name. You cannot specify reserved device names in a relative path. ./C: is not valid.
There is not a one-button solution for downloading a full bucket to your local machine through the Cloud Shell.
The best option for an environment like yours (only using the Cloud Shell interface, without gcloud installed on your local system), is to follow a series of steps:
Downloading the whole bucket on the Cloud Shell environment
Zip the contents of the bucket
Upload the zipped file
Download the file through the browser
Clean up:
Delete the local files (local in the context of the Cloud Shell)
Delete the zipped bucket file
Unzip the bucket locally
This has the advantage of only having to download a single file on your local machine.
This might seem a lot of steps for a non-developer, but it's actually pretty simple:
First, run this on the Cloud Shell:
mkdir /tmp/bucket-contents/
gsutil -m cp -R gs://my-uniquename-bucket /tmp/bucket-contents/
pushd /tmp/bucket-contents/
zip -r /tmp/zipped-bucket.zip .
popd
gsutil cp /tmp/zipped-bucket.zip gs://my-uniquename-bucket/zipped-bucket.zip
Then, download the zipped file through this link: https://storage.cloud.google.com/my-uniquename-bucket/zipped-bucket.zip
Finally, clean up:
rm -rf /tmp/bucket-contents
rm /tmp/zipped-bucket.zip
gsutil rm gs://my-uniquename-bucket/zipped-bucket.zip
After these steps, you'll have a zipped-bucket.zip file in your local system that you can unzip with the tool of your choice.
Note that this might not work if you have too much data in your bucket and the Cloud Shell environment can't store all the data, but you could repeat the same steps on folders instead of buckets to have a manageable size.

Gsutil creates 'tmp' files in the buckets

We are using automation scripts to upload thousands of files from MAPR HDFS to GCP storage. Sometimes the files in the main bucket appear with tmp~!# suffix it causes failures in our pipeline.
Example:
gs://some_path/.pre-processing/file_name.gz.tmp~!#
We are using rsync -m and in certain cases cp -I
some_file | gsutil -m cp -I '{GCP_DESTINATION}'
gsutil -m rsync {MAPR_SOURCE} '{GCP_DESTINATION}'
It's possible that copy attempt failed and retried later from a different machine, eventually, we have both the file and another one with the tmp~!# suffix
I'd want to get rid of these files without actively looking for them.
we have gsutil 4.33, appreciate any lead. Thx

Error "No URLs matched" When copying Google cloud bucket data to my local computer?

I am trying to download a folder which is inside my Google Cloud Bucket, I read from google docs gsutil/commands/cp and executed below the line.
gsutil cp -r appengine.googleapis.com gs://my-bucket
But i am getting the error
CommandException: No URLs matched: appengine.googleapis.com
Edit
By running below command
gsutil cp -r gs://logsnotimelimit .
I am getting Error
IOError: [Errno 22] invalid mode ('ab') or filename: u'.\logsnotimelimit\appengine.googleapis.com\nginx.request\2018\03\14\14:00:00_14:59:59_S0.json_.gstmp'
What is the appengine.googleapis.com parameter in your command? Is that a local directory on your filesystem you are trying to copy to the cloud bucket?
The gsutil cp -r appengine.googleapis.com gs://my-bucket command you provided will copy a local directory named appengine.googleapis.com recursively to your cloud bucket named my-bucket. If that's not what you are doing - you need to construct your command differently.
I.e. to download a directory named folder from your cloud bucket named my-bucket into the current location try running
gsutil cp -r gs://my-bucket/folder .
-- Update: Since it appears that you're using a Windows machine (the "\" directory separators instead of "/" in the error message) and since the filenames contain the ":" character - the cp command will end up failing when creating those files with the error message you're seeing.
Just wanted to help people out if they run into this problem on Windows. As administrator:
Open C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\gsutil\gslib\utils
Delete copy_helper.pyc
Change the permissions for copy_helper.py to allow writing
Open copy_helper.py
Go to the function _GetDownloadFile
On line 2312 (at time of writing), change the following line
download_file_name = _GetDownloadTempFileName(dst_url)
to (for example, objective is to remove the colons):
download_file_name = _GetDownloadTempFileName(dst_url).replace(':', '-')
Go to the function _ValidateAndCompleteDownload
On line 3184 (at time of writing), change the following line
final_file_name = dst_url.object_name
to (for example, objective is to remove the colons):
final_file_name = dst_url.object_name.replace(':', '-')
Save the file, and rerun the gsutil command
FYI, I was using the command gsutil -m cp -r gs://my-bucket/* . to download all my logs, which by default contain : which does not bode well for Windows files!
Hope this helps someone, I know it's a somewhat hacky solution, but seeing as you never need (should have) colons in Windows filenames, it's fine to do and forget. Just remember that if you update the Google SDK you'll have to redo this.
I got same issue and resolved it as below.
Open a cloud shell, and copy objects by using gsutil command.
gsutil -m cp -r gs://[some bucket]/[object] .
On the shell, zip those objects by using zip command.
zip [some file name].zip -r [some name of your specific folder]
On the shell, copy the zip file into GCS by using gsutil command.
gsutil cp [some file name].zip gs://[some bucket] .
On a Windows Command Prompt, copy the zip file in GCS by using gsutil command.
gsutil cp gs://[some bucket]/[some file name].zip .
I wish this information helps someone.
This is also gsutil's way of saying file not found. The mention of URL is just confusing in the context of local files.
Be careful, in this command, the file path is case sensitive. You can check if it is not a capitalized letter issue.

Where can I find the folder which I downloaded from gcloud bucket

By using gcloud shell I have downloaded all my bucket but i couldn't find the downloaded files.
I used the command
gsutil -m cp -R gs://bucket/* .
P.S. Please don't make -1 on that post if I asked something wrong let me know in comments and I will learn how to ask a question correctly and save your time. Thanks
You used the command gsutil cp, as documented here:
https://cloud.google.com/storage/docs/gsutil/commands/cp
The parameters for this command are:
gsutil cp [OPTION]... src_url dst_url
So you used Option gsutil -m for to perform a parallel (multi-threaded/multi-processing) copy.
Then you also added -R to traverse all directories in your bucket
As "destination URL" you entered a "." which specified the current working directory.
So your files should be located in your home directory, or in any directory where you switched to using the cd command inside your command window.
It would download to the directory you were in when you ran the command. If you never changed the directory using $cd ... command, then it should be at the root. On a Mac, that would be Macintosh > Users > YourName.