How to copy file from bucket GCS to my local machine

How to copy file from bucket GCS to my local machine - google-cloud-platform

I need copy files from Google Cloud Storage to my local machine:
I try this command o terminal of compute engine:
$sudo gsutil cp -r gs://mirror-bf /var/www/html/mydir
That is my directory on local machine /var/www/html/mydir.
i have that error:
CommandException: Destination URL must name a directory, bucket, or bucket
subdirectory for the multiple source form of the cp command.
Where the mistake?

You must first create the directory /var/www/html/mydir.
Then, you must run the gsutil command on your local machine and not in the Google Cloud Shell. The Cloud Shell runs on a remote machine and can't deal directly with your local directories.

I have had a similar problem and went through the painful process of having to figuring it out too, so I thought I would provide my step by step solution (under Windows, hopefully similar for unix users) with snapshots and hope it helps others:
The first thing (as many others have pointed out on various stackoverflow threads), you have to run a local Console (in admin mode) for this to work (ie. do not use the cloud shell terminal).
Here are the steps:
Assuming you already have Python installed on your machine, you will then need to install the gsutil python package using pip from your console:
pip install gsutil
The Console looks like this:
You will then be able to run the gsutil config from that same console:
gsutil config
As you can see from the snapshot bellow, a .boto file needs to be created. It is needed to make sure you have permissions to access your drive.
Also note that you are now provided an URL, which is needed in order to get the authorization code (prompted in the console).
Open a browser and paste this URL in, then:
Log in to your Google account (ie. account linked to your Google Cloud)
Google ask you to confirm you want to give access to GSUTIL. Click Allow:
You will then be given an authorization code, which you can copy and paste to your console:
Finally you are asked for a project-id:
Get the project ID of interest from your Google Cloud.
In order to find these IDs, click on "My First Project" as circled here below:
Then you will be provided a list of all your projects and their ID.
Paste that ID in you console, hit enter and here you are! You now have created your .boto file. This should be all you need to be able to play with your Cloud storage.
Console output:
Boto config file "C:\Users\xxxx\.boto" created. If you need to use a proxy to access the Internet please see the instructions in that file.
You will then be able to copy your files and folders from the cloud to your PC using the following gsutil Command:
gsutil -m cp -r gs://myCloudFolderOfInterest/ "D:\MyDestinationFolder"
Files from within "myCloudFolderOfInterest" should then get copied to the destination "MyDestinationFolder" (on your local computer).

gsutil -m cp -r gs://bucketname/ "C:\Users\test"
I put a "r" before file path, i.e., r"C:\Users\test" and got the same error. So I removed the "r" and it worked for me.

Check with '.' as ./var
$sudo gsutil cp -r gs://mirror-bf ./var/www/html/mydir
or maybe below problem
gsutil cp does not support copying special file types such as sockets, device files, named pipes, or any other non-standard files intended to represent an operating system resource. You should not run gsutil cp with sources that include such files (for example, recursively copying the root directory on Linux that includes /dev ). If you do, gsutil cp may fail or hang.
Source: https://cloud.google.com/storage/docs/gsutil/commands/cp

the syntax that worked for me downloading to a Mac was
gsutil cp -r gs://bucketname dir Dropbox/directoryname

Related

Trying to download organization data to an external drive

I am trying to backup all of our Google Cloud data to an external storage device.
There is a lot of data so I am attempting to download the entire bucket at once and am using the following command to do so, but it halts saying that there isn't enough storage on the device to complete the transfer.
gsutil -m cp -r \
"bucket name" \
.
What do I need to add to this command to download this information to my local D: drive? I have searched through the available docs and have not been able to find the answer.
I used the gsutil command that GCP provided for me automatically, but it seems to be trying to copy the files to a destination without enough storage to hold the needed data.

Remember that you are running the command from the Cloud Shell and not in a local terminal or Windows Command Line. If you inspect the Cloud Shell's file system/structure, it resembles more that of a Unix environment in which you can specify the destination like such instead: ~/bucketfiles/. Even a simple gsutil -m cp -R gs://bucket-name.appspot.com ./ will work since Cloud Shell can identify the ./ directory which is the current directory.
A workaround to this is to perform the command on your Windows Command Line. You would have to install Google Cloud SDK beforehand.
Alternatively, this can also be done in Cloud Shell, albeit with an extra step:
Download the bucket objects by running gsutil -m cp -R gs://bucket-name ~/ which will download it into the home directory in Cloud Shell
Transfer the files downloaded in the ~/ (home) directory from Cloud Shell to the local machine either through the User Interface or by running gcloud alpha cloud-shell scp.

how to use gsutil rsync. login and download bucket contents to a local directory

I have the following questions.
I got access to a cloud bucket to my email id. Now I want to download the whole bucket folder into a local directory on ubuntu. I installed gsutil from pip.
Is the command correct?
gsutil rsync gs://bucket_name .
the command seems generic how do I give my gmail credentials to it? The file is 1TB of size and I am allowed to download only once so I want to get the command right.

The command is correct if you want your current directory to mirror the contents of the bucket (including deleting any files on the right not found on the left). If you merely want to copy, you might want cp -r instead.
Here are the current docs on how to authenticate when running a standalone gsutil. It looks like you just need to run gsutil config.

cannot open path of the current working directory: Permission denied when trying to run rsync command twice in gcloud

I am trying to copy the data from the file stores to the google cloud bucket.
This is the command I am using:
gsutil rsync -r /fileserver/demo/dir1 gs://corp-bucket
corp-bucket: Name of my bucket
/fileserver/demo/dir1: Mount point directory (This directory contain the data of the file store)
This command works fine in the first time, It copies the data of the directory /fileserver/demo/dir1 to the cloud bucket but then I delete the data from the cloud bucket and again run the same command without any changes then I get this error:
cannot open path of the current working directory: Permission denied
NOTE: If I made even a small changes to the file of the /fileserver/demo/dir1 and run the above command then again it works fine but my question is why it is not working without any changes and is there any way to copy file without making any changes
Thanks.

You may be hitting the limitation #2 of rsync " The gsutil rsync command considers only the live object version in the source and destination buckets"; You can exclude /dir1... with the -x pattern and still let rsync makes the clean up work as the regular sync process.
Another way to copy those files will be to use cp with -r option to make it recursively instead of rsync.

How to exclude the particular folder in gcs bucket in google cloud while copying to local machine?

I am trying to copy the files and folders from google cloud storage to vm machine using gsutil command but i need to exclude few of the folders in the gcs bucket while copying to vm, i tried searching for the options but i couldn't find it, please help if anyone knows the command for this.
Thanks in-advance,

For this you can use a command like:
gsutil -m rsync -r -x '^dir3/*' gs://bucket
this should retrieve all objects located on the bucket, except objects beginning with dir3 (files not located in dir3 directory in your example).
Here you can find more details about the rsync command

How to download an entire bucket in GCP?

I have a problem downloading entire folder in GCP. How should I download the whole bucket? I run this code in GCP Shell Environment:
gsutil -m cp -R gs://my-uniquename-bucket ./C:\Users\Myname\Desktop\Bucket
and I get an error message: "CommandException: Destination URL must name a directory, bucket, or bucket subdirectory for the multiple source form of the cp command. CommandException: 7 files/objects could not be transferred."
Could someone please point out the mistake in the code line?

To download an entire bucket You must install google cloud SDK
then run this command
gsutil -m cp -R gs://project-bucket-name path/to/local
where path/to/local is your path of local storage of your machine

The error lies within the destination URL as specified by the error message.
I run this code in GCP Shell Environment
Remember that you are running the command from the Cloud Shell and not in a local terminal or Windows Command Line. Thus, it is throwing that error because it cannot find the path you specified. If you inspect the Cloud Shell's file system/structure, it resembles more that of a Unix environment in which you can specify the destination like such instead: ~/bucketfiles/. Even a simple gsutil -m cp -R gs://bucket-name.appspot.com ./ will work since Cloud Shell can identify the ./ directory which is the current directory.
A workaround to this issue is to perform the command on your Windows Command Line. You would have to install Google Cloud SDK beforehand.
Alternatively, this can also be done in Cloud Shell, albeit with an extra step:
Download the bucket objects by running gsutil -m cp -R gs://bucket-name ~/ which will download it into the home directory in Cloud Shell
Transfer the files downloaded in the ~/ (home) directory from Cloud Shell to the local machine either through the User Interface or by running gcloud alpha cloud-shell scp

Your destination path is invalid:
./C:\Users\Myname\Desktop\Bucket
Change to:
/Users/Myname/Desktop/Bucket
C: is a reserved device name. You cannot specify reserved device names in a relative path. ./C: is not valid.

There is not a one-button solution for downloading a full bucket to your local machine through the Cloud Shell.
The best option for an environment like yours (only using the Cloud Shell interface, without gcloud installed on your local system), is to follow a series of steps:
Downloading the whole bucket on the Cloud Shell environment
Zip the contents of the bucket
Upload the zipped file
Download the file through the browser
Clean up:
Delete the local files (local in the context of the Cloud Shell)
Delete the zipped bucket file
Unzip the bucket locally
This has the advantage of only having to download a single file on your local machine.
This might seem a lot of steps for a non-developer, but it's actually pretty simple:
First, run this on the Cloud Shell:
mkdir /tmp/bucket-contents/
gsutil -m cp -R gs://my-uniquename-bucket /tmp/bucket-contents/
pushd /tmp/bucket-contents/
zip -r /tmp/zipped-bucket.zip .
popd
gsutil cp /tmp/zipped-bucket.zip gs://my-uniquename-bucket/zipped-bucket.zip
Then, download the zipped file through this link: https://storage.cloud.google.com/my-uniquename-bucket/zipped-bucket.zip
Finally, clean up:
rm -rf /tmp/bucket-contents
rm /tmp/zipped-bucket.zip
gsutil rm gs://my-uniquename-bucket/zipped-bucket.zip
After these steps, you'll have a zipped-bucket.zip file in your local system that you can unzip with the tool of your choice.
Note that this might not work if you have too much data in your bucket and the Cloud Shell environment can't store all the data, but you could repeat the same steps on folders instead of buckets to have a manageable size.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js