I’m trying to sync one AWS S3 bucket to another bucket across different AWS accounts.
How can I do it periodically so that any file written to the source bucket is automatically transferred to the destination? Do I need to use a Lambda function to execute the aws s3 sync command?
Thanks
Option 1: AWS CLI Sync
You could run aws s3 sync on a regular basis, which will only copy new/changed files. This makes it very efficient. However, if there are a large number of files (10,000+), it can take a long time to determine which files need to be copied. You will also need to schedule the command to run somewhere (e.g. a cron job).
Option 2: AWS Lambda function
You could create an AWS Lambda function that is triggered by Amazon S3 whenever a new object is created. The Lambda function will be passed details of the Bucket & Object via the event parameter. The Lambda function could then call CopyObject() to copy the object immediately. The advantage of this method is that the objects are copied as soon as they are created.
(Do not use an AWS Lambda function to call the AWS CLI. The above function would be called for each file individually.)
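For illustration, here is a minimal sketch of such a Lambda handler in Python (boto3), assuming the destination bucket name is supplied via a DEST_BUCKET environment variable (the variable name and bucket names are placeholders, not part of the original question):

```python
import os
import urllib.parse

import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    # DEST_BUCKET is an assumed environment variable naming the target bucket
    dest_bucket = os.environ["DEST_BUCKET"]

    for record in event["Records"]:
        src_bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Server-side copy; the object data never passes through the Lambda function
        s3.copy_object(
            Bucket=dest_bucket,
            Key=key,
            CopySource={"Bucket": src_bucket, "Key": key},
            ACL="bucket-owner-full-control",  # hand ownership to the destination account
        )
```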
Option 3: Amazon S3 Replication
You can configure Amazon S3 Replication to automatically replicate newly-created objects between the buckets (including buckets between different AWS Accounts). This is the simplest option since it does not require any coding.
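If it helps, here is a rough boto3 sketch of what enabling cross-account replication might look like. The role ARN, bucket names and account ID are placeholders; both buckets must have versioning enabled, and the destination bucket still needs a policy that allows the replication role to write:

```python
import boto3

s3 = boto3.client("s3")

# Replication requires versioning on both buckets (shown here for the source only)
s3.put_bucket_versioning(
    Bucket="source-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)

s3.put_bucket_replication(
    Bucket="source-bucket",
    ReplicationConfiguration={
        # Placeholder role that S3 assumes to perform the replication
        "Role": "arn:aws:iam::111111111111:role/s3-replication-role",
        "Rules": [
            {
                "ID": "replicate-everything",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {},
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {
                    "Bucket": "arn:aws:s3:::destination-bucket",
                    "Account": "222222222222",
                    # Transfer object ownership to the destination account
                    "AccessControlTranslation": {"Owner": "Destination"},
                },
            }
        ],
    },
)
```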
Permissions
When copying S3 objects between accounts, you will need to use a single set of credentials that has both Read permission on the source bucket and Write permission on the target bucket. This can be done in two ways:
Use credentials (IAM User or IAM Role) from the source account that have permission to read the source bucket. Create a bucket policy on the target bucket that permits those credentials to PutObject into the bucket. When copying, specify ACL=bucket-owner-full-control to grant object ownership to the destination account.
OR
Use credentials from the target account that have permission to write to the target bucket. Create a bucket policy on the source bucket that permits those credentials to GetObject from the bucket.
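As a hedged illustration of the first (push) variant, the bucket policy on the target bucket might look roughly like this (the principal ARN and bucket name are placeholders):

```python
import json

import boto3

s3 = boto3.client("s3")  # credentials for the account that owns the target bucket

# Placeholder ARN for the source-account user/role that will perform the copy
copier_arn = "arn:aws:iam::111111111111:user/copy-user"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCrossAccountPut",
            "Effect": "Allow",
            "Principal": {"AWS": copier_arn},
            "Action": ["s3:PutObject", "s3:PutObjectAcl"],
            "Resource": "arn:aws:s3:::target-bucket/*",
        }
    ],
}

s3.put_bucket_policy(Bucket="target-bucket", Policy=json.dumps(policy))
```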
Related
I have an AWS S3 and Redshift question:
A company uses two AWS accounts for accessing various AWS services. The analytics team has just configured an Amazon S3 bucket in AWS account A for writing data from the Amazon Redshift cluster provisioned in AWS account B. The team has noticed that the files created in the S3 bucket using UNLOAD command from the Redshift cluster are not accessible to the bucket owner user of the AWS account A that created the S3 bucket.
What could be the reason for this denial of permission for resources belonging to the same AWS account?
I tried to reproduce the scenario from the question, but I can't.
I don't understand S3 Object Ownership and Bucket Ownership.
You are not the only person confused by Amazon S3 object ownership. When writing files from one AWS Account to a bucket owned by a different AWS Account, it is possible for the 'ownership' of objects to remain with the 'sending' account. This causes all sorts of problems.
Fortunately, AWS has introduced an S3 feature called Object Ownership that avoids all of these issues:
By setting "ACLs disabled" for an S3 Bucket, objects will always be owned by the AWS Account that owns the bucket.
So, you should configure this option for the S3 bucket in AWS account A (the account that owns the bucket) and it should all work nicely.
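For reference, here is a minimal sketch of turning this setting on with boto3. The bucket name is a placeholder, and "ACLs disabled" in the console corresponds to the BucketOwnerEnforced setting:

```python
import boto3

s3 = boto3.client("s3")  # credentials for the bucket-owning account

s3.put_bucket_ownership_controls(
    Bucket="analytics-bucket",  # placeholder bucket name
    OwnershipControls={
        "Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]
    },
)
```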
The problem is that the bucket owner in account A does not have access to files that were uploaded by account B. Usually that is solved by specifying the ACL parameter when uploading files (--acl bucket-owner-full-control). Since the upload is done via Redshift, you need to tell Redshift to assume a role in account A for the UNLOAD command, so the files don't change ownership and continue to belong to account A. Check the following page for more examples of configuring cross-account LOAD/UNLOAD: https://aws.amazon.com/premiumsupport/knowledge-center/redshift-s3-cross-account/
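As a rough sketch of that role-chaining approach, run via the Redshift Data API (all ARNs, cluster, database and table names are placeholders; the first role in the chain must be the one attached to the cluster in account B):

```python
import boto3

# The Redshift Data API runs SQL without managing a database connection
client = boto3.client("redshift-data")

# Role chaining: the cluster's role in account B assumes a role in account A,
# so the unloaded files are written (and therefore owned) by account A.
sql = """
UNLOAD ('SELECT * FROM analytics.events')
TO 's3://account-a-bucket/exports/events_'
IAM_ROLE 'arn:aws:iam::222222222222:role/RedshiftRoleB,arn:aws:iam::111111111111:role/S3WriteRoleA'
FORMAT PARQUET;
"""

client.execute_statement(
    ClusterIdentifier="my-redshift-cluster",  # placeholder cluster name
    Database="dev",
    DbUser="awsuser",
    Sql=sql,
)
```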
Can we replicate a whole bucket, including its event notifications, lifecycle policy, ACL permissions and other configuration, from one account to another account in AWS?
I know s3 copy (s3 cp) and s3 sync exist, but they only copy data; they do not replicate the whole S3 bucket.
We have 50,000 buckets in one account and we need to replicate all 50,000 buckets with their data into another AWS account, so it would replicate each whole bucket (data + configuration).
Any ideas would be really helpful.
We did
aws s3 sync s3://SOURCE-BUCKET-NAME s3://DESTINATION-BUCKET-NAME --source-region SOURCE-REGION-NAME --region DESTINATION-REGION-NAME
There are no commands available to replicate "bucket configurations". You would need to:
Loop through each source bucket
Make API calls to discover the configurations
Make API calls to create the destination buckets and create similar configurations (but be careful -- you probably don't want to replicate things like notifications since they wouldn't be valid in a different account)
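A rough sketch of that loop, using the lifecycle configuration as one example of a setting to carry over (the profile names are placeholders, and only one configuration type is shown):

```python
import boto3
from botocore.exceptions import ClientError

# Two sessions, one per account (profile names are placeholders)
src = boto3.Session(profile_name="source-account").client("s3")
dst = boto3.Session(profile_name="destination-account").client("s3")

for bucket in src.list_buckets()["Buckets"]:
    name = bucket["Name"]

    # Recreate the bucket in the destination account. Outside us-east-1 you also need
    # CreateBucketConfiguration={"LocationConstraint": region}, and since bucket names
    # are globally unique you will likely need a prefix/suffix in practice.
    dst.create_bucket(Bucket=name)

    # Example: carry over the lifecycle configuration, if one exists
    try:
        lifecycle = src.get_bucket_lifecycle_configuration(Bucket=name)
        dst.put_bucket_lifecycle_configuration(
            Bucket=name,
            LifecycleConfiguration={"Rules": lifecycle["Rules"]},
        )
    except ClientError as err:
        if err.response["Error"]["Code"] != "NoSuchLifecycleConfiguration":
            raise

    # Repeat the same get/put pattern for tagging, encryption, CORS, etc.
    # (Skip notifications and policies that reference account-specific ARNs.)
```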
I have an access key and access ID for both AWS buckets, which belong to different accounts. I have to copy data from one location to another; is there a way to do it faster?
I have tried MapReduce-based DistCp, but it does not provide satisfactory performance.
The best way to copy data between Amazon S3 buckets in different accounts is to use a single set of credentials that has permission to read from the source bucket and write to the destination bucket.
You can then use these credentials with the CopyObject() command, which will copy the object between the S3 buckets without the need to download and upload the objects. The copy will be fully managed by the Amazon S3 service, even if the buckets are in different accounts and even different regions. The copy will not involve transferring any data to/from your own computer.
If you use the AWS CLI aws s3 cp --recursive or aws s3 sync commands, the copies will be performed in parallel, making very fast copies of the objects.
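For illustration, a minimal CopyObject sketch with a single set of credentials (bucket names and key are placeholders); note that objects larger than 5 GB need a multipart copy, for example boto3's managed copy():

```python
import boto3

# One set of credentials that can read the source bucket and write the destination bucket
s3 = boto3.client("s3")

# Server-side copy: S3 moves the bytes internally, nothing is downloaded locally
s3.copy_object(
    Bucket="destination-bucket",
    Key="path/to/object.csv",
    CopySource={"Bucket": "source-bucket", "Key": "path/to/object.csv"},
)
```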
There are two ways to perform a copy:
Push
Use a set of credentials from the Source account that has permission to read from the source bucket
Add a Bucket Policy on the destination bucket that permits Write access for these credentials
When performing the copy, use ACL=bucket-owner-full-control to assign ownership of the object to the destination account
OR
Pull
Use a set of credentials from the Destination account that has permission to write to the destination bucket
Add a Bucket Policy on the source bucket that permits Read access for these credentials
(No ACL is required because "pulling" the file will automatically give ownership to the account issuing the command)
I am trying to move client data from the client's S3 bucket (s3://client-bucket) to our organization's S3 bucket (s3://org-bucket). I was given access keys to the client's S3 bucket.
Using the AWS CLI I am able to access the client's S3 bucket and see all files. However, I cannot use aws s3 mv because the profile that has access to client-bucket does not have permissions set up for org-bucket.
I am not allowed to move the data through an intermediate public bucket because of security issues/the sensitivity of the data.
What is the best way of making this transfer go through? Is there a way to set up a profile in the AWS CLI config/credentials with both the access keys for org-bucket and client-bucket?
The best way is to use the access keys from your organization to access your client's S3 bucket. Since you need to copy objects directly via the CopyObject API, your IAM user/role needs access to both the S3 bucket in your org AND your client's S3 bucket. Therefore, your current approach doesn't work, and AssumeRole would not work either. You can follow this guide to configure proper resource-based policies in S3.
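As a hedged sketch, the resource-based (bucket) policy your client could attach to client-bucket so that your organization's credentials can read from it might look like this (the principal ARN is a placeholder):

```python
import json

import boto3

# Run by the client with credentials that administer client-bucket
s3 = boto3.client("s3")

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowOrgAccountRead",
            "Effect": "Allow",
            # Placeholder ARN for the organization's IAM user/role
            "Principal": {"AWS": "arn:aws:iam::222222222222:user/org-copy-user"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::client-bucket",
                "arn:aws:s3:::client-bucket/*",
            ],
        }
    ],
}

s3.put_bucket_policy(Bucket="client-bucket", Policy=json.dumps(policy))
```

With a policy like that in place, your org-bucket credentials can run aws s3 cp or aws s3 sync directly from client-bucket to org-bucket.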
I have data that arrives in S3 in Account A that I want to automatically copy to S3 in Account B, but I do not understand how I can reference the files in Account A in my Lambda in Account B to do the copy.
Completed Steps so far:
1. Account B: inline policy added to the execution role referencing the Account A S3 bucket
2. Account B: permission given to Account A to invoke the Lambda
3. Account A: bucket policy allowing S3 access to the execution role in Account B
4. Account A: event notification to the Account B Lambda (all ObjectCreate events)
Am I missing some steps here, and if not, how can my Lambda directly reference the individual files captured by the event?
Update due to comments:
From the question above, I'm not sure I understand the setup, but here's how I would approach this from an architectural perspective:
A Lambda function inside account A gets triggered by the S3 event when an object is uploaded.
The Lambda function retrieves the uploaded object from the source bucket.
The Lambda function assumes a role in account B, which grants permission to write into the target bucket.
The Lambda function writes the object into the target bucket.
The permissions you need are:
An execution role for the Lambda function in account A that (a) grants permission to read from the source bucket and (b) grants permission to assume the role in account B (see next item below)
A cross-account role in account B, (a) trusting the above execution role and (b) granting permission to write into the target bucket
Note: Make sure to save the object granting bucket-owner-full-control so that account B has permissions to use the copied object.
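Putting it together, here is a minimal sketch of the Lambda function in account A (the role ARN, bucket names and session name are placeholders, not taken from your setup):

```python
import urllib.parse

import boto3

# Placeholder values for the cross-account role and target bucket
TARGET_ROLE_ARN = "arn:aws:iam::222222222222:role/cross-account-s3-writer"
TARGET_BUCKET = "account-b-target-bucket"

s3_source = boto3.client("s3")   # uses the Lambda execution role (account A)
sts = boto3.client("sts")

def lambda_handler(event, context):
    # Assume the role in account B to get credentials that can write to the target bucket
    creds = sts.assume_role(
        RoleArn=TARGET_ROLE_ARN,
        RoleSessionName="cross-account-copy",
    )["Credentials"]

    s3_target = boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Read with account A credentials, write with account B credentials.
        # (Reading the whole body into memory is fine for small objects only.)
        body = s3_source.get_object(Bucket=bucket, Key=key)["Body"].read()
        s3_target.put_object(
            Bucket=TARGET_BUCKET,
            Key=key,
            Body=body,
            ACL="bucket-owner-full-control",  # as recommended in the note above
        )
```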
If you want to replicate the objects to a bucket in a different AWS account and don't care about the fact that it can take up to 15 minutes for the replication to be done, you don’t need to build anything yourself. Simply use the Amazon S3 Replication feature.
Replication enables automatic, asynchronous copying of objects across Amazon S3 buckets. Buckets that are configured for object replication can be owned by the same AWS account or by different accounts. You can copy objects between different AWS Regions or within the same Region.