Gsutil Terminal Command not actually updating files in the Cloud Console - google-cloud-platform

Linux distro: Pop!_OS 19.10 (Ubuntu-based)
For context, I'm hosting a static website from a Google Cloud Storage bucket.
So I tried executing the command
gsutil -m cp -r ejsout gs://evaapp.xyz
to push to my storage bucket.
The command copies all the files successfully, returning
/ [333/333 files][ 19.7 MiB/ 19.7 MiB] 100% Done 998.4 KiB/s ETA 00:00:00
Operation completed over 333 objects/19.7 MiB.
but when I go and look at the bucket online, the files aren't overwritten.
I've waited hours, even days, and nothing happens to the files in the Google Cloud Console. Only files that didn't previously exist show up in the bucket console; existing files are never overwritten. I have to update the files manually through the Cloud Console for them to change.
Any way to fix this? Feel free to close this issue because I'm bad at this and I couldn't find any helpful documentation on it. It's just annoying; I want to do the push from the terminal. Thanks!
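(Not part of the original question, but a possible sanity check: you can confirm from the terminal whether the objects were actually replaced, instead of relying on the Console view, by inspecting their timestamps. The bucket and folder names below are reused from the command above; index.html is just a hypothetical example file.)
# list objects with size and last-modified time
gsutil ls -l gs://evaapp.xyz/ejsout/**
# or inspect a single object's metadata
gsutil stat gs://evaapp.xyz/ejsout/index.html
If the timestamps advance after each cp but the site or Console still shows old content, the copy itself is working and the stale view is a caching/display issue.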

Related

Downloading s3 bucket to local directory but files not copying?

There are many, many examples of how to download a directory of files from an s3 bucket to a local directory.
aws s3 cp s3://<bucket>/<directory> /<path>/<to>/<local>/ --recursive
However, when I run this command from the AWS CLI that I've configured, I see confirmation in the terminal like:
download: s3://mybucket/myfolder/data1.json to /my/local/dir/data1.json
download: s3://mybucket/myfolder/data2.json to /my/local/dir/data2.json
download: s3://mybucket/myfolder/data3.json to /my/local/dir/data3.json
...
But then I check /my/local/dir for the files, and my directory is empty. I've tried using the sync command instead, I've tried copying just a single file - nothing seems to work right now. In the past I did successfully run this command and downloaded the files as expected.
Why are my files not being copied now, despite seeing no errors?
For testing, you can go to your /my/local/dir folder and execute the following command:
aws s3 sync s3://mybucket/myfolder .
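(An optional extra check, not part of the original answer: aws s3 cp and aws s3 sync accept a --dryrun flag that prints what would be transferred without writing anything, which helps confirm the source and destination paths are what you expect.)
aws s3 sync s3://mybucket/myfolder . --dryrun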

Copy files to Container-Optimised OS from a GCP Storage bucket

How can one download files from a GCP Storage bucket to a Container-Optimised OS (COS) on instance startup?
I know of the following solutions:
gcloud compute copy-files
SSH through console
SCP
Yet all of these have to be done manually and externally after an instance is started.
There is also cloud-init, yet I can't find any info on how to copy files from a Storage bucket. Examples seem to suggest that it's better to include the content of the files directly in the cloud-init file, which is not something I want to do for security reasons. Is it possible to download files from a Storage bucket using cloud-init?
I considered using a startup script, yet COS lacks CLI tools such as gcloud and gsutil, so I can't run any such commands from a startup script.
I know I could copy the files manually and then save the image as a boot disk, but I'm hoping there are solutions that avoid having to do so.
Most of all, I'm assuming I'm not asking for something impossible, given that COS instance setup allows me to specify Docker volumes that I could mount onto the starting container. This seems to suggest I should be able to have some private files on the instance the moment COS will attempt to run my image on startup. But how?
Trying to execute a startup-script with a cloud-sdk image and copying files there as suggested by Guillaume didn't work for me for a while, showing this log. Eventually I realised that the cloud-sdk image is 2.41GB when uncompressed and takes over 2 minutes to complete pulling. I tried again with an empty COS instance and the startup script completed successfully, downloading the data from a Storage bucket.
However, a 2.41GB image and over 2 minutes of boot time sound like overkill for downloading a 2KB file. Don't they?
I'm glad to see a working solution to my question (thanks Guillaume!) although I'm still wondering: isn't there a nicer way to do this? I feel that this method is even less tidy than manually putting the files on the COS instance and then creating a machine image to use in the future.
Based on Guillaume's answer I created and published a gsutil wrapper image, available as voyz/gsutil_wrap. This way I am able to run a startup-script with the following command:
docker run -v /host/path:/container/path \
--entrypoint gsutil voyz/gsutil_wrap \
cp gs://bucket/path /container/path
It's essentially a copy of what Guillaume suggested, except it is using an image containing only a minimum setup required to run gsutil. As a result it weighs 0.22GB and pulls within 10-20 seconds on average - as opposed to 2.41GB and over 2 minutes respectively for the google/cloud-sdk image suggested by Guillaume.
Also, credit to this incredibly useful StackOverflow answer that allows gsutil to use the default service account for authentication.
The startup-script is the correct place to do this. And YES, COS lacks some useful libraries.
BUT you can run containers! For example, the Google Cloud SDK container!
So, add this startup-script in the VM metadata:
key -> startup-script
value ->
docker run -v /local/path/to/copy/files:/dummy/container/path \
--entrypoint gsutil google/cloud-sdk \
cp gs://your_bucket/path/to/file /dummy/container/path
Note: the startup script is run as root. Perform a chmod/chown in your startup script if you need to change the file access mode.
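(A minimal sketch of that permission step, reusing the host path from the command above and a hypothetical non-root user named someuser:)
# make the downloaded file readable by and owned by a non-root user
chmod 644 /local/path/to/copy/files/file
chown someuser:someuser /local/path/to/copy/files/file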
Let me know if you need more explanation on this command line
Of course, with a fresh COS image, the startup time is quite long (pull the container image and extract it).
To reduce the startup time, you can "bake" your image. I mean, start with a COS image, download/install what you want on it (or only perform a docker pull of the google/cloud-sdk container) and create a custom image from it.
This way, all the required dependencies will be present on the image and the boot will be quicker.
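(A rough sketch of the baking step, assuming hypothetical instance, image and zone names, and that the docker pull has already been done on the instance; it is not part of the original answer.)
# stop the COS instance you prepared
gcloud compute instances stop cos-builder --zone=us-central1-a
# create a custom image from its boot disk
gcloud compute images create cos-baked-image --source-disk=cos-builder --source-disk-zone=us-central1-a
New VMs created from cos-baked-image then boot with the container image already cached, so the startup script only has to run it.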

How to copy file from bucket GCS to my local machine

I need to copy files from Google Cloud Storage to my local machine.
I tried this command in the terminal of a Compute Engine instance:
$sudo gsutil cp -r gs://mirror-bf /var/www/html/mydir
That is the directory on my local machine: /var/www/html/mydir.
I get this error:
CommandException: Destination URL must name a directory, bucket, or bucket
subdirectory for the multiple source form of the cp command.
Where is the mistake?
You must first create the directory /var/www/html/mydir.
Then, you must run the gsutil command on your local machine and not in the Google Cloud Shell. The Cloud Shell runs on a remote machine and can't deal directly with your local directories.
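(A minimal sketch of those two steps, run on the local machine and reusing the paths from the question:)
# create the destination directory first
sudo mkdir -p /var/www/html/mydir
# then run the copy locally, not in Cloud Shell
sudo gsutil cp -r gs://mirror-bf /var/www/html/mydir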
I have had a similar problem and went through the painful process of having to figure it out too, so I thought I would provide my step-by-step solution (under Windows, hopefully similar for Unix users) and hope it helps others:
First of all (as many others have pointed out in various Stack Overflow threads), you have to run a local console (in admin mode) for this to work (i.e. do not use the Cloud Shell terminal).
Here are the steps:
Assuming you already have Python installed on your machine, you will then need to install the gsutil python package using pip from your console:
pip install gsutil
You will then be able to run the gsutil config from that same console:
gsutil config
A .boto file now needs to be created. It is needed to make sure you have the permissions to access your storage.
Also note that you are now provided a URL, which is needed in order to get the authorization code (prompted in the console).
Open a browser and paste this URL in, then:
Log in to your Google account (i.e. the account linked to your Google Cloud).
Google asks you to confirm that you want to give access to gsutil. Click Allow.
You will then be given an authorization code, which you can copy and paste into your console.
Finally you are asked for a project-id:
Get the project ID of interest from your Google Cloud.
In order to find these IDs, click on "My First Project".
You will then be shown a list of all your projects and their IDs.
Paste the ID into your console, hit Enter, and there you are! You have now created your .boto file. This should be all you need to be able to work with your Cloud Storage.
Console output:
Boto config file "C:\Users\xxxx\.boto" created. If you need to use a proxy to access the Internet please see the instructions in that file.
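(One optional check, not part of the original steps: gsutil version -l prints, among other details, the config path it is using, so you can confirm the new .boto file is being picked up.)
gsutil version -l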
You will then be able to copy your files and folders from the cloud to your PC using the following gsutil Command:
gsutil -m cp -r gs://myCloudFolderOfInterest/ "D:\MyDestinationFolder"
Files from within "myCloudFolderOfInterest" should then get copied to the destination "MyDestinationFolder" (on your local computer).
gsutil -m cp -r gs://bucketname/ "C:\Users\test"
I put a "r" before file path, i.e., r"C:\Users\test" and got the same error. So I removed the "r" and it worked for me.
Try prefixing the destination with '.', as in ./var:
$sudo gsutil cp -r gs://mirror-bf ./var/www/html/mydir
Or it may be the problem described below:
gsutil cp does not support copying special file types such as sockets, device files, named pipes, or any other non-standard files intended to represent an operating system resource. You should not run gsutil cp with sources that include such files (for example, recursively copying the root directory on Linux that includes /dev). If you do, gsutil cp may fail or hang.
Source: https://cloud.google.com/storage/docs/gsutil/commands/cp
The syntax that worked for me when downloading to a Mac was:
gsutil cp -r gs://bucketname dir Dropbox/directoryname

Disable progress output aws s3 sync without disabling all output

Is there any way to disable the
Completed 1 of 12 part(s) with 11 file(s) remaining...
progress output with the aws s3 sync command (from the aws cli tools).
I know there is a --quiet option but I don't want to use it because I still want the Upload... details in my logfile.
Not a big issue, but creates mess in the logfile like:
Completed 1 of 12 part(s) with 11 file(s) remaining^Mupload: local/file to s3://some.bucket/remote/file
Where ^M is a control character.
As of October 2017, it is possible to only suppress upload progress with aws s3 cp and aws s3 sync by using the --no-progress option:
--no-progress (boolean) File transfer progress is not displayed. This flag is only applied when the quiet and only-show-errors flags are not provided.
Example:
aws s3 sync /path/to/directory s3://bucket/folder --no-progress
Output:
upload: /path/to/directory to s3://bucket/folder
I had a quick look at the CLI tools code and currently it is not possible to disable that message.
You should use the --only-show-errors flag while running the command. You may also want --no-progress. This will minimize the logging.
More specs: https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
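(For example, using the same hypothetical paths as the earlier answer:)
aws s3 sync /path/to/directory s3://bucket/folder --only-show-errors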
You can't disable the message completely. You can only delete it from the logfile by editing it, but when you run the command again it will show up again.

AWS S3 upload fails: RequestTimeTooSkewed

I'm using
aws s3 sync ~/folder/ s3:// --delete
to upload (and sync) a large number of files to an S3 bucket. Some - but not all - of the files fail, throwing this error message:
upload failed: to s3://bucketname/folder/
A client error (RequestTimeTooSkewed) occurred when calling the UploadPart operation: The difference between the request time and the current time is too large
I know that the cause of this error is usually a local time that's out of sync with Internet time, but I'm running NTP (on my Ubuntu PC) and the date/time seem absolutely accurate - and this error has only been reported for about 15 out of the forty or so files I've uploaded so far.
Some of the files are relatively large - up to about 70MB each - and my upload speeds aren't fantastic: could S3 possibly be comparing the initial and completion times and reporting their difference as an error?
Thanks,
The time verification happens at the start of your upload to S3, so it won't be to do with files taking too long to upload.
Try comparing your system time with what S3 is reporting and see if there is any unexpected time drift, just to make sure:
# Time from Amazon
$ curl http://s3.amazonaws.com -v
# Time on your local machine
$ date -u
(Time is returned in UTC)
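(A small convenience on top of the same check, not from the original answer: pull just the Date header out of the curl response and compare it directly with date -u.)
curl -sI http://s3.amazonaws.com | grep -i '^date'
date -u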
I was running aws s3 cp inside a Docker container on a MacBook Pro and got this error. Restarting Docker for Mac fixed the issue.
Amazon S3 keeps its own system clocks in sync via NTP; you should sync your clock with Amazon's NTP servers.
Run
sudo apt-get install ntp
then open /etc/ntp.conf and add at the bottom
server 0.amazon.pool.ntp.org iburst
server 1.amazon.pool.ntp.org iburst
server 2.amazon.pool.ntp.org iburst
server 3.amazon.pool.ntp.org iburst
Then run
sudo service ntp restart
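(On newer Ubuntu releases that ship systemd-timesyncd instead of the ntp package, a roughly equivalent alternative, not part of the original answer, is:)
# check whether the clock is synchronized
timedatectl status
# enable automatic time sync
sudo timedatectl set-ntp true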
It seems that the multipart uploads were failing with aws s3. Using s3cmd instead works perfectly.
You have to sync the local time on your machine; it is out of sync with world time.
I was having the issue on macOS.
I fixed it via:
System Preferences -> Date & Time -> check the box "Set date and time automatically"
Restarting the machine fixed this issue for me.
In cmd, run aws configure and set your default region name (for example us-east-1). Whatever region it may be, it should not be left as none:
Default region name [none] -->> ×
Default region name [us-east-1] -->> √
Then create a bucket via the GUI on the AWS website and check its creation date and time. Note down that date and time from AWS, set your PC's date and time to match it in your PC's settings, and then try the command in cmd again:
-> aws s3 ls