We are able to put objects into our S3 Bucket.
But now we have a requirement that we need to put these Object directly to an S3 Bucket which belongs to a different account and different region.
Here we have few questions:
Is this possible?
If possible what changes we need to do for this?
They have provided us Access Key, Secret Key, Region, and Bucket details.
Any comments and suggestions will be appreciated.
IAM credentials are associated with a single AWS Account.
When you launch your own Amazon EC2 instance with an assigned IAM Role, it will receive access credentials that are associated with your account.
To write to another account's Amazon S3 bucket, you have two options:
Option 1: Your credentials + Bucket Policy
The owner of the destination Amazon S3 bucket can add a Bucket Policy on the bucket that permits access by your IAM Role. This way, you can just use the normal credentials available on the EC2 instance.
Option 2: Their credentials
It appears that you have been given access credentials for their account. You can use these credentials to access their Amazon S3 bucket.
As detailed on Working with AWS Credentials - AWS SDK for Java, you can provide these credentials in several ways. However, if you are using BOTH the credentials provided by the IAM Role AND the credentials that have been given to you, it can be difficult to 'switch between' them. (I'm not sure if there is a way to tell the Credentials Provider to switch between a profile stored in the ~/.aws/credentials file and those provided via instance metadata.)
Thus, the easiest way is to specify the Access Key and Secret Key when creating the S3 client:
BasicAWSCredentials awsCreds = new BasicAWSCredentials("access_key_id", "secret_key_id");
AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
.withCredentials(new AWSStaticCredentialsProvider(awsCreds))
.build();
It is generally not a good idea to put credentials in your code. You should load them from a configuration file.
Yes, it's possible. You need to allow cross account S3 put operation in bucket's policy.
Here is a blog by AWS. It should help you in setting up cross account put action.
Related
I have running app on auto-scaled ec2 env. of account1 created via AWS CDK (it also should have support to be run on multiple regions). During the app execution I need to get object from account2's s3.
One of the ways to get s3 data is use tmp credentials(via sts assume role):
on account1 side create a policy for ec2 instance role to assume sts tmp credentials for s3 object
on account2 side create a policy GetObject access to the s3 object
on account2 site create role and attach point2's policy to it + trust relationship to account1's ec2 role
Pros: no user credentials are required to get access to the data
Cons: after each env update requires manual permission configuration
Another way is to create a user in account2 with permission to get s3 object and put the credentials on account1 side.
Pros: after each env update doesn't require manual permission configuration
Cons: Exposes IAM user's credentials
Is there a better option to eliminate manual permission config and explicit IAM user credentials sharing?
You can add a Bucket Policy on the Amazon S3 bucket in Account 2 that permits access by the IAM Role used by the Amazon EC2 instance in Account 1.
That way, the EC2 instance(s) can access the bucket just like it is in the same Account, without have to assume any roles or users.
Simply set the Principal to be the ARN of the IAM Role used by the EC2 instances.
Cyberduck version: Version 7.9.2
Cyberduck is designed to access non-public AWS buckets. It asks for:
Server
Port
Access Key ID
Secret Access Key
The Registry of Open Data on AWS provides this information for an open dataset (using the example at https://registry.opendata.aws/target/):
Resource type: S3 Bucket
Amazon Resource Name (ARN): arn:aws:s3:::gdc-target-phs000218-2-open
AWS Region: us-east-1
AWS CLI Access (No AWS account required): aws s3 ls s3://gdc-target-phs000218-2-open/ --no-sign-request
Is there a version of s3://gdc-target-phs000218-2-open that can be used in Cyberduck to connect to the data?
If the bucket is public, any AWS credentials will suffice. So as long as you can create an AWS account, you only need to create an IAM user for yourself with programmatic access, and you are all set.
No doubt, it's a pain because creating an AWS account needs your credit (or debit) card! But see https://stackoverflow.com/a/44825406/1094109 and https://stackoverflow.com/a/44825406/1094109
I tried this with s3://gdc-target-phs000218-2-open and it worked:
For RODA buckets that provide public access to specific prefixes, you'd need to edit the path to suit. E.g. s3://cellpainting-gallery/cpg0000-jump-pilot/source_4/ (this is a RODA bucket maintained by us, yet to be released fully)
NOTE: The screenshots below show a different URL that's no longer operational. The correct URL is s3://cellpainting-gallery/cpg0000-jump-pilot/source_4/
No, it's explicitly stated in the documentation that
You must obtain the login credentials [in order to connect to Amazon S3 in Cyberduck]
Usually there is Compute Engine default service account that is created automatically by GCP, this account is used for example by VM agents to access different resources across GCP and by default has role/editor permissions.
Suppose I want to create GCS bucket that can only be accessed by this default service account and no one else. I've looked into ACLs and tried to add an ACL to the bucket with this default service account email but it didn't really work.
I realized that I can still access bucket and objects in this bucket from other accounts that have for example storage bucket read and storage object read permissions and I'm not sure what I did wrong (maybe some default ACLs are present?).
My questions are:
Is it possible to limit access to just that default account? In that case who will not be able to access it?
What would be the best way to do it? (would appreciate a lot an example using Storage API)
There are still roles such as role/StorageAdmin, and actually no matter what ACLs will be put on the bucket I could still access it if I had this role (or higher role such as owner) right?
Thanks!
I recommend you not to use ACL (and Google also). It's better to switch the bucket in uniform IAM policy.
There are 2 bad side of ACL:
New created files aren't ACL and you need to set it everytime that you create a ne file
It's difficult to know who has and who hasn't access with ACL. IAM service is better for auditing.
When you switch to Uniform IAM access, Owner, Viewer, and Editor role no longer have access to buckets (the role/storage.admin isn't included in this primitive role). It could solve in one click all the unwanted access. Else, as John said, remove all the IAM permission on the bucket and the project that have access to the bucket except your service account.
You can control access to buckets and objects using Cloud IAM and ACLs.
For example grant the service account WRITE (R: READ,W: WRITE,O: OWNER) access to the bucket using ACLs:
gsutil acl ch -u service-account#project.iam.gserviceaccount.com:W gs://my-bucket
To remove access of service account from the bucket:
gsutil acl ch -d service-account#project.iam.gserviceaccount.com gs://my-bucket
If There are roles such as role/StorageAdmin in the IAM identities (project level), they will have access to all the GCS resources of the project. You might have to change the permission to avoid them having access.
I have a usecase to use AWS Lambda to copy files/objects from one S3 bucket to another. In this usecase Source S3 bucket is in a separate AWS account(say Account 1) where the provider has only given us AccessKey & SecretAccess Key. Our Lambda runs in Account 2 and the destination bucket can be either in Account 2 or some other account 3 altogether which can be accessed using IAM role. The setup is like this due to multiple partner sharing data files
Usually, I used to use the following boto3 command to copy the contents between two buckets when everything is in the same account but want to know how this can be modified for the new usecase
copy_source_object = {'Bucket': source_bucket_name, 'Key': source_file_key}
s3_client.copy_object(CopySource=copy_source_object, Bucket=destination_bucket_name, Key=destination_file_key)
How can the above code be modified to fit my usecase of having accesskey based connection to source bucket and roles for destination bucket(which can be cross-account role as well)? Please let me know if any clarification is required
There's multiple options here. Easiest is by providing credentials to boto3 (docs). I would suggest retrieving the keys from the SSM parameter store or secrets manager so they're not stored hardcoded.
Edit: I realize the problem now, you can't use the same session for both buckets, makes sense. The exact thing you want is not possible (ie. use copy_object). The trick is to use 2 separate session so you don't mix the credentials. You would need to get_object from the first account and put_object to the second objects. You should be able to simply put the resp['Body'] from the get in the put request but I haven't tested this.
import boto3
acc1_session = boto3.session.Session(
aws_access_key_id=ACCESS_KEY_acc1,
aws_secret_access_key=SECRET_KEY_acc1
)
acc2_session = boto3.session.Session(
aws_access_key_id=ACCESS_KEY_acc2,
aws_secret_access_key=SECRET_KEY_acc2
)
acc1_client = acc1_session.client('s3')
acc2_client = acc2_session.client('s3')
copy_source_object = {'Bucket': source_bucket_name, 'Key': source_file_key}
resp = acc1_client.get_object(Bucket=source_bucket_name, Key=source_file_key)
acc2_client.put_object(Bucket=destination_bucket_name, Key=destination_file_key, Body=resp['Body'])
Your situation appears to be:
Account-1:
Amazon S3 bucket containing files you wish to copy
You have an Access Key + Secret Key from Account-1 that can read these objects
Account-2:
AWS Lambda function that has an IAM Role that can write to a destination bucket
When using the CopyObject() command, the credentials used must have read permission on the source bucket and write permission on the destination bucket. There are normally two ways to do this:
Use credentials from Account-1 to 'push' the file to Account-2. This requires a Bucket Policy on the destination bucket that permits PutObject for the Account-1 credentials. Also, you should set ACL= bucket-owner-full-control to handover control to Account-2. (This sounds similar to your situation.) OR
Use credentials from Account-2 to 'pull' the file from Account-1. This requires a Bucket Policy on the source bucket that permits GetObject for the Account-2 credentials.
If you can't ask for a change to the Bucket Policy on the source bucket that permits Account-2 to read the contents, then **you'll need a Bucket Policy on the Destination bucket that permits write access by the credentials from Account-1`.
This is made more complex by the fact that you are potentially copying the object to a bucket in "some other account". There is no easy answer if you are starting to use 3 accounts in the process.
Bottom line: If possible, ask them for a change to the source bucket's Bucket Policy so that your Lambda function can read the files without having to change credentials. It can then copy objects to any bucket that the function's IAM Role can access.
I have a HDP cluster on AWS and I have one s3(in other account) also, my hadoop version is Hadoop 3.1.1.3.0.1.0-187
Now I want to read from the s3 (which is in different account) and process, then write the result to my s3(same account as cluster).
But as per the HDP guide Here tells, I can configure only one keys of either my account or other account.
But in my case I want to configure two account keys, so How to do do that ?
Due to some security reason, other account can not change the bucket policy to add IAM role which is created in my account , Hence I tried to access like below
Configured the keys of other account
Added IAM role(which has access policy for my bucket) of my account
but Still I got below error when I tried to access my account s3 from spark write
com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3
What you need is to use the EC2 instance profile role. It is an IAM role that is attached to your instance: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html
You first create a role with permissions that allow s3 access. Then you attach that role to your HDP cluster(EC2 autoscaling group and EMR can both achieve that).No IAM access key configuration needed on your side, although AWS still does that for you in the background. This is the s3 "outbound" access part.
The 2nd step is to set up the bucket policy to allow cross-account access: https://docs.aws.amazon.com/AmazonS3/latest/dev/example-walkthroughs-managing-access-example2.html
You will need to do this for each bucket in your different accounts. This is basically the "inbound" s3 access permission part.
You will encounter 400 if any part of your access(i.e., your instance profile role's permission, S3 bucket ACL, bucket policy, public access block setting and etc..) is denied in the permission chain. There are much more layers on the "inbound" side. So to start to get things working, if you are not IAM expert, try to start with a very open policy(use '*' wildcard) and then narrow things down.
If I've understood right
you want your EC2 VMs to access an S3 bucket to which the IAM role doesn't have access
your have a set of AWS login details for the external S3 bucket (login and password)
HDP3 has an default auth chain of, in order
per-bucket secrets. fs.s3a.bucket.NAME.access.key, fs.s3a.bucket.NAME.secret.key
config-wide secrets fs.s3a.access.key, fs.s3a.secret.key
env vars AWS_ACCESS_KEY and AWS_SECRET_KEY
the IAM Role (it does an HTTP GET to the 169.something server which serves up a new set of IAM role credentials at least once an hour)
What you need to try here is set up some per-bucket secrets for only the external source (either in a JCEKS file on all nodes in core-site.xml, or in the spark default. For example, if the external bucket was s3a://external, you'd have
spark.hadoop.fs.s3a.bucket.external.access.key AKAISOMETHING spark.hadoop.fs.s3a.bucket.external.secret.key SECRETSOMETHING
HDP3/Hadoop 3 can handle >1 secret in the same JCEKS file without problems. HADOOP-14507. my code. Older versions let you put username:secret in the URI, but that's such a security troublespot (everything logs those URIs as they aren't viewed as sensistive), that feature has been cut from Hadoop now. Stick to the JCEKs file with a per-bucket secret, falling back to IAM role for your own data
Note you can fiddle with the authentication list for ordering and behaviour: if you add use the TemporaryAWSCredentialsProvider then it'll support session keys as well, which is often handy.
<property>
<name>fs.s3a.aws.credentials.provider</name>
<value>
org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider,
org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,
com.amazonaws.auth.EnvironmentVariableCredentialsProvider,
org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider
</value>
</property>