gsutil rsync with s3 buckets gives InvalidAccessKeyId error - amazon-web-services

I am trying to copy all the data from an AWS S3 bucket to a GCS bucket. According to this answer, the rsync command should be able to do that, but I am receiving the following error when I try:
Caught non-retryable exception while listing s3://my-s3-source/: AccessDeniedException: 403 InvalidAccessKeyId
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InvalidAccessKeyId</Code><Message>The AWS Access Key Id you provided does not exist in our records.</Message><AWSAccessKeyId>{REDACTED}</AWSAccessKeyId><RequestId>{REDACTED}</RequestId><HostId>{REDACTED}</HostId></Error>
CommandException: Caught non-retryable exception - aborting rsync
This is the command I am trying to run
gsutil -m rsync -r s3://my-s3-source gs://my-gcs-destination
I have the AWS CLI installed, and it works fine with the same AccessKeyId, listing buckets as well as objects in the bucket.
Any idea what I am doing wrong here?

gsutil can work with both Google Storage and S3.
gsutil rsync -d -r s3://my-aws-bucket gs://example-bucket
You just need to configure it with both Google and AWS S3 credentials. For the AWS side, add your Amazon S3 credentials to ~/.aws/credentials, or store them in the .boto configuration file that gsutil reads. However, when you access an Amazon S3 bucket with gsutil, the Boto library uses your ~/.aws/credentials file to override other credentials, such as any that are stored in ~/.boto.
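For reference, a minimal sketch of the two options, with placeholder values only. In ~/.aws/credentials:
[default]
aws_access_key_id = YOUR_AWS_ACCESS_KEY_ID
aws_secret_access_key = YOUR_AWS_SECRET_ACCESS_KEY
or the equivalent section in the .boto file:
[Credentials]
aws_access_key_id = YOUR_AWS_ACCESS_KEY_ID
aws_secret_access_key = YOUR_AWS_SECRET_ACCESS_KEY
Once either file is in place, gsutil should be able to list the S3 source, e.g. gsutil ls s3://my-s3-source.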
=== 1st update ===
Also make sure you have the correct IAM permissions on the GCP side and the correct AWS IAM credentials. In addition, if you have a prior version of Migrate for Compute Engine (formerly Velostrata), use this documentation and make sure you set up the VPN, IAM credentials, and AWS network. If you are using the current version (5.0), use the following documentation to check that everything is configured correctly.

Related

How to upload local system files in my Linux server to Amazon S3 Bucket using ssh?

I am trying to upload a file from my Linux server to my AWS S3 bucket. Can anyone please advise on how to do so? I can only find documentation about uploading files to an EC2 instance instead.
I do have the .pem certificate present in my server directory.
I tried to run the following command, but it doesn't work:
scp -i My_PEM_FILE.pem "MY_FILE_TO_BE_UPLOADED.txt" MY_USER#S3-INSTANCE.IP.ADDRESS.0.compute.amazonaws.com
It is not possible to upload to Amazon S3 by using SSH.
The easiest way to upload from anywhere to an Amazon S3 bucket is to use the AWS Command-Line Interface (CLI):
aws s3 cp MY_FILE_TO_BE_UPLOADED.txt s3://my-bucket/
This will require an Access Key and a Secret Key to be stored via the aws configure command. You can obtain these keys from your IAM User in the IAM management console (Security Credentials tab).
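For example, a rough first-time session could look like the following (the key value and the us-east-1 region are placeholders, not values from your account):
aws configure
# AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
# AWS Secret Access Key [None]: (your secret key)
# Default region name [None]: us-east-1
# Default output format [None]: json
aws s3 cp MY_FILE_TO_BE_UPLOADED.txt s3://my-bucket/
# verify the upload
aws s3 ls s3://my-bucket/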
See: aws s3 cp — AWS CLI Command Reference

How to download a directory from an s3 bucket?

I downloaded the aws cli with the macos gui installer:
https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
And I'm running this command to download the directory from an s3 bucket to local:
aws s3 cp s3://macbook-pro-2019-prit/Desktop/pfp/properties/ ./ --recursive
But I'm getting this error:
rosetta error: /var/db/oah/279281327407104_279281327407104/dcf7796bca04d6b4d944583b3355e7db61ca27505539c35142c439a9dbfe60d0/aws.aot: attachment of code signature supplement failed: 1
zsh: trace trap aws s3 cp s3://macbook-pro-2019-prit/Desktop/pfp/properties/ . --recursive
How do I fix this error?
I believe your issue is with the operating system (the Rosetta error), so start by reinstalling the AWS CLI. If that doesn't fix it, verify basic access first by just listing the bucket (aws s3 ls s3://macbook-pro-2019-prit) and check the ACLs, bucket policy, and any other access controls on the bucket. Also confirm that your AWS user has a policy that allows listing and getting objects from macbook-pro-2019-prit.
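As a sketch of that order of operations on macOS (the installer URL is the standard AWS CLI v2 package from the install guide you linked; the bucket and path are taken from your command):
# reinstall the AWS CLI v2 from the official macOS package
curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
sudo installer -pkg AWSCLIV2.pkg -target /
# confirm the binary now runs without the Rosetta error
aws --version
# verify list access before retrying the recursive copy
aws s3 ls s3://macbook-pro-2019-prit
aws s3 cp s3://macbook-pro-2019-prit/Desktop/pfp/properties/ ./ --recursive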

How to re-format cURL command to AWS CLI s3 sync call

I have a curl command:
curl -C - "https://fastmri-dataset.s3.amazonaws.com/knee_singlecoil_train.tar.gz?AWSAccessKeyId=<my-key>&Signature=<my-signature>&Expires=1634085391"
I'm having trouble using AWS CLI sync, I'm doing this:
aws s3 sync . s3://fastmri-dataset/knee_singlecoil_train.tar.gz
I have the aws configure file set up with the access key, and I set the secret up with the signature. I didn't otherwise get a secret.
Any help is appreciated.
Your first link is an Amazon S3 pre-signed URL, which is a time-limited URL that provides temporary access to a private object. It can be accessed via an HTTP/S call.
The AWS CLI command you have shown instructs the AWS CLI to synchronize the local directory with a file on S3. (This is actually incorrect, since you cannot sync a directory to a file.)
These two commands are incompatible. The AWS CLI cannot use a pre-signed URL. It requires AWS credentials with an Access Key and Secret Key to make AWS API calls.
So, you cannot reformat the curl command to an AWS CLI command.
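If the goal is simply to fetch that object, a minimal sketch is to keep using curl with the pre-signed URL (placeholders kept from your command) and write the output to a local file:
curl -C - -o knee_singlecoil_train.tar.gz "https://fastmri-dataset.s3.amazonaws.com/knee_singlecoil_train.tar.gz?AWSAccessKeyId=<my-key>&Signature=<my-signature>&Expires=1634085391"
# -C - resumes an interrupted download; -o saves the response to a local file
Note that the URL stops working after the Expires timestamp passes.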

Where to run the command to access private S3 bucket?

Apologies, this is such a rookie question. A report I set up is being run daily and deposited in the customer S3 bucket. I was given the command to run if I wanted to inspect the bucket contents. I want to verify my report is as expected in there, so I'd like to access it. But I have no idea where to actually run the command.
Do I need to install the AWS CLI and run it there, or is there something I need to install so I can run it from Terminal? The command has the AWS secret key, access key and URL.
If you wish to access an object from Amazon S3 on your own computer:
Download the AWS Command-Line Interface (CLI)
Run: aws configure and provide your Access Key & Secret Key
To list a bucket: aws s3 ls s3://bucket-name
To download an object: aws s3 cp s3://bucket-name/object-name.txt .
(That last period means "to the current directory".)
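Putting those steps together, a sample session (bucket and object names are placeholders) might look like:
aws configure
# enter the Access Key, Secret Key, default region and output format when prompted
aws s3 ls s3://bucket-name
# confirm the daily report object appears in the listing
aws s3 cp s3://bucket-name/object-name.txt .
# downloads it to the current directory for inspection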

gsutil - issue with cp, rsync when using federated user AWS keys

I'm attempting a simple rsync (or cp) from AWS S3 to GCP Storage.
For e.g.
gsutil rsync -d -r -n s3://mycustomer-src gs://mycustomer-target
I get an error message as below when attempting this on a VM on GCP.
Note that if I install aws cli on the VM, then I can access / browse AWS S3 contents just fine. The AWS credentials are stored in ~/.aws/credentials file.
Building synchronization state...
Caught non-retryable exception while listing s3://musiclab-etl-dev/: AccessDeniedException: 403 InvalidAccessKeyId
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InvalidAccessKeyId</Code><Message>The AWS Access Key Id you provided does not exist in our records.</Message><AWSAccessKeyId>ASIAJ3XGCQ7RGZYPD5UA</AWSAccessKeyId><RequestId>CE8919045C68DEC4</RequestId><HostId>i7oMBM61US3FyePJka8O+rjoHSo1rIZbRGnVZvIGkjEVPh6lXdbp03pZOtJ68F3pPdAAW1UvF5s=</HostId></Error>
CommandException: Caught non-retryable exception - aborting rsync
Is this a bug in gsutil? Any workarounds or tips appreciated.
NOTE - The client's AWS account is set up for federated access and requires AWS keys obtained using a script similar to this:
https://aws.amazon.com/blogs/security/how-to-implement-a-general-solution-for-federated-apicli-access-using-saml-2-0/
The AWS keys are set to expire when the session token expires.
If I use a different AWS account (no federation) with typical AWS keys (non-expiring), the rsync (or cp) works fine.
It appears that gsutil still uses the legacy AWS_SECURITY_TOKEN instead of AWS_SESSION_TOKEN. If your script doesn't set it up automatically, you can do it manually like this:
export AWS_SECURITY_TOKEN=$AWS_SESSION_TOKEN
After this you should be able to use gsutil normally.
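As a sketch, assuming your federation script leaves the temporary credentials in the usual environment variables (placeholders shown), the full sequence before running gsutil would be:
# temporary credentials produced by the federated login script
export AWS_ACCESS_KEY_ID=<temporary-access-key>
export AWS_SECRET_ACCESS_KEY=<temporary-secret-key>
export AWS_SESSION_TOKEN=<temporary-session-token>
# mirror the session token into the legacy variable name that gsutil's boto reads
export AWS_SECURITY_TOKEN=$AWS_SESSION_TOKEN
gsutil rsync -d -r -n s3://mycustomer-src gs://mycustomer-target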