EMR Serverless Read From S3 Cross Account - amazon-web-services

I have an EMR Serverless application that runs inside a VPC, in a private subnet with a NAT gateway.
The application can read files from buckets in the same account, but when it tries to read a file from a bucket in another AWS account it gets access denied.
The EMR application has the following policy attached:
S3FullAccess:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:*",
"s3-object-lambda:*"
],
"Resource": "*"
}
]
}
EMR Spark packages settings:
--conf spark.jars.packages=org.apache.hadoop:hadoop-aws:3.2.0
Job Spark:
from pyspark.sql import SparkSession
spark = (SparkSession.builder
.config("spark.hadoop.fs.s3a.fast.upload", True)
.config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
.config("spark.sql.adaptive.enabled", "true")
.config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.2.0")
.enableHiveSupport().getOrCreate()
)
df_from_another_s3_account_bucket = spark.read.parquet('s3a://bucket_account_b/path/file.parquet')
This execution is returning the error:
java.nio.file.AccessDeniedException: s3a://bucket_account_b/path/file.parquet: getFileStatus on s3a://bucket_account_b/path/file.parquet: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden;
I have tried using credentials in the spark configs:
.config("fs.s3a.access.key", key_bucket_account_b)
.config("fs.s3a.secret.key", secret_bucket_account_b)
I have also tried creating a bucket policy that allows the EMR AWS account, but I still got the same error.
{
"Version": "2012-10-17",
"Id": "Policy1543283",
"Statement": [
{
"Sid": "Stmt1412820423",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::emr_aws_account:root"
},
"Action": "s3:*",
"Resource": "arn:aws:s3:::bucket_account_b"
}
]
}
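One thing to note about the policy above: its Resource is only the bucket ARN, while s3:GetObject is evaluated against the object ARN (the /* form). A version granting both levels, applied from account B with boto3, might look like the following sketch (the account ID and bucket name are the placeholders from the question, and this is not a confirmed fix):
import json
import boto3
# Sketch: cross-account bucket policy covering both bucket-level and
# object-level actions for the EMR account.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowEmrAccount",
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::emr_aws_account:root"},
        "Action": ["s3:ListBucket", "s3:GetObject"],
        "Resource": [
            "arn:aws:s3:::bucket_account_b",
            "arn:aws:s3:::bucket_account_b/*"
        ]
    }]
}
s3 = boto3.client("s3")  # credentials belonging to account B
s3.put_bucket_policy(Bucket="bucket_account_b", Policy=json.dumps(policy))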
What else could I try?

Related

How to setup terraform state on encrypted s3 bucket

I have set up an S3 backend for Terraform state following this excellent answer by Austin Davis. I followed the suggestion by Matt Lavin to add a policy encrypting the bucket.
Unfortunately, with that bucket policy in place, terraform state list now throws:
Failed to load state: AccessDenied: Access Denied status code: 403, request id: XXXXXXXXXXXXXXXX, host id: XXXX...
I suspect I'm missing something on the Terraform side, either configuration to encrypt the communication or an additional policy entry that would allow reading the encrypted state.
This is the policy added to the tf-state bucket:
{
"Version": "2012-10-17",
"Id": "RequireEncryption",
"Statement": [
{
"Sid": "RequireEncryptedTransport",
"Effect": "Deny",
"Action": ["s3:*"],
"Resource": ["arn:aws:s3:::${aws_s3_bucket.terraform_state.bucket}/*"],
"Condition": {
"Bool": {
"aws:SecureTransport": "false"
}
},
"Principal": "*"
},
{
"Sid": "RequireEncryptedStorage",
"Effect": "Deny",
"Action": ["s3:PutObject"],
"Resource": ["arn:aws:s3:::${aws_s3_bucket.terraform_state.bucket}/*"],
"Condition": {
"StringNotEquals": {
"s3:x-amz-server-side-encryption": "AES256"
}
},
"Principal": "*"
}
]
}
I would start by removing that bucket policy and just enabling the newer default bucket encryption setting on the S3 bucket. If you still get access denied after doing that, then the IAM role you are using when you run Terraform is missing some permissions.
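For reference, the default bucket encryption mentioned above can also be enabled outside of Terraform; a boto3 sketch, assuming a placeholder bucket name:
import boto3
s3 = boto3.client("s3")
# Sketch: default SSE-S3 (AES256) encryption, so objects are encrypted at rest
# without every PutObject having to send the x-amz-server-side-encryption header.
s3.put_bucket_encryption(
    Bucket="my-terraform-state-bucket",  # placeholder name
    ServerSideEncryptionConfiguration={
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
        ]
    },
)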

InsufficientS3BucketPolicyFault when enabling AWS Redshift audit logging through Terraform

Problem
I'm trying to enable audit logging on an AWS redshift cluster. I've been following the instructions provided by AWS here: https://docs.aws.amazon.com/redshift/latest/mgmt/db-auditing.html#db-auditing-enable-logging
Current Configuration
I've defined the relevant IAM role as follows
resource "aws_iam_role" "example-role" {
name = "example-role"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "redshift.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}
And have granted the following IAM permissions to the example-role role:
{
"Sid": "AllowAccessForAuditLogging",
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetBucketAcl"
],
"Resource": [
"arn:aws:s3:::example-bucket",
"arn:aws:s3:::example-bucket/*"
]
},
The relevant portion of the redshift cluster configuration is as follows:
resource "aws_redshift_cluster" "example-cluster-name" {
cluster_identifier = "example-cluster-name"
...
# redshift audit logging to S3
logging {
enable = true
bucket_name = "example-bucket-name"
}
master_username = var.master_username
iam_roles = [aws_iam_role.example-role.arn]
...
Error
terraform plan runs correctly, and produces the expected plan based on the above configuration. However, when running terraform apply the following error occurs:
Error: error enabling Redshift Cluster (example-cluster-name) logging: InsufficientS3BucketPolicyFault: Cannot read ACLs of bucket example-bucket-name. Please ensure that your IAM permissions are set up correctly.
Note: I've replaced all resource identifiers with example-* names.
shimo's answer is correct; I'm just adding detail for someone like me.
Redshift has full access to S3, but you also need to add a bucket policy (the S3-side permission):
{
"Sid": "Statement1",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::361669875840:user/logs"
},
"Action": [
"s3:GetBucketAcl",
"s3:PutObject"
],
"Resource": [
"arn:aws:s3:::<your-bucket>",
"arn:aws:s3:::<your-bucket>/*"
]
}
- `361669875840` must correspond to your region; check the list [here][1]
[1]: https://github.com/finos/compliant-financial-infrastructure/blob/main/aws/redshift/redshift_template_public.yml
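To verify the setup outside of Terraform, a rough boto3 equivalent of the bucket policy plus enabling logging could look like this (a sketch; the account ID is the region-specific value discussed above, and the bucket/cluster names are the example placeholders):
import json
import boto3
bucket = "example-bucket-name"
# Region-specific Redshift logging principal from the answer above.
redshift_logs_principal = "arn:aws:iam::361669875840:user/logs"
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "Statement1",
        "Effect": "Allow",
        "Principal": {"AWS": redshift_logs_principal},
        "Action": ["s3:GetBucketAcl", "s3:PutObject"],
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]
    }]
}
boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
# With the bucket policy in place, enable audit logging on the cluster.
boto3.client("redshift").enable_logging(
    ClusterIdentifier="example-cluster-name",
    BucketName=bucket,
    S3KeyPrefix="audit/",
)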

Uploading to AWS S3 bucket from a profile in a different environment

I have access to one of two AWS environments, and I've created a protected S3 bucket in it that will be uploaded to from an account in the environment I do not have access to. That environment and account are what a project's CI uses.
environment I have access to: env1
environment I do not have access to: env2
account I do not have access to: user/ci
bucket name: content
S3 bucket policy:
{
"Version": "2008-10-17",
"Id": "PolicyForCloudFrontPrivateContent",
"Statement": [
{
...
},
{
"Sid": "Allow access to bucket from profile in env1",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111122223333:user/ci"
},
"Action": [
"s3:GetBucketLocation",
"s3:ListBucket*"
],
"Resource": "arn:aws:s3:::content"
},
{
"Sid": "Allow access to bucket items from profile in env1",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111122223333:user/ci"
},
"Action": [
"s3:Get*",
"s3:PutObject",
"s3:ListMultipartUploadParts"
],
"Resource": [
"arn:aws:s3:::content",
"arn:aws:s3:::content/*"
]
}
]
}
From inside a container that's configured for env1 and user/ci I'm testing with the command
aws s3 sync content/ s3://content/
and I get the error:
fatal error: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
I have two questions:
Am I even using the correct aws command to upload the data to the bucket?
Am I missing something from my bucket policy?
For the latter, I've basically followed what a load of examples and answers online have suggested.
To test your policy, I did the following:
Created an IAM User with no policies
Created an Amazon S3 bucket
Attached your Bucket Policy to the bucket, and updated the ARN and bucket name
Tested access to the bucket with:
aws s3 ls s3://bucketname
aws s3 sync folder/ s3://bucketname/folder/
It worked fine.
Therefore, the policy you display appears to be giving all necessary permissions. It is possible that you have something else that is Denying access on the bucket.
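One way to look for such a Deny is the IAM policy simulator; a boto3 sketch using the placeholder user ARN and bucket name from the question:
import boto3
iam = boto3.client("iam")
# Sketch: simulate the CI user's effective permissions against the bucket
# to surface any explicit Deny coming from another attached policy.
result = iam.simulate_principal_policy(
    PolicySourceArn="arn:aws:iam::111122223333:user/ci",
    ActionNames=["s3:ListBucket", "s3:PutObject"],
    ResourceArns=["arn:aws:s3:::content", "arn:aws:s3:::content/*"],
)
for evaluation in result["EvaluationResults"]:
    print(evaluation["EvalActionName"], evaluation["EvalDecision"])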
The solution was to grant the following policy (note the ACL permission)
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:PutObjectAcl"
],
"Resource": [
"arn:aws:s3:::content",
"arn:aws:s3:::content/*"
]
}
]
}
to user/ci in env1.
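With that permission in place, the upload itself can hand control of each object to the bucket owner; a boto3 sketch of what user/ci could run (file path and key are hypothetical):
import boto3
s3 = boto3.client("s3")
# Sketch: upload one file and grant the bucket-owning account full control,
# which is what the PutObjectAcl permission above allows.
s3.upload_file(
    "content/index.html",   # hypothetical local file
    "content",              # bucket name from the question
    "index.html",
    ExtraArgs={"ACL": "bucket-owner-full-control"},
)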

S3 Access Denied when getting objects from CloudTrail S3 bucket

I use the cloudtrail bucket to make Athena queries and I keep getting this error:
Your query has the following error(s):
com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: A57070510EFFB74B; S3 Extended Request ID: v56qfenqDD8d5oXUlkgfExqShUqlxwRwTQHR1S0PmHpp7WH+cz0x8D2pPLPkRoGz2o428hmOV1U=), S3 Extended Request ID: v56qfenqDD8d5oXUlkgfExqShUqlxwRwTQHR1S0PmHpp7WH+cz0x8D2pPLPkRoGz2o428hmOV1U= (Path: s3://cf-account-foundation-cloudtrailstack-trailbucket-707522222211/AWSLogs/707522222211/CloudTrail/ca-central-1/2019/01/11/707522222211_CloudTrail_ca-central-1_20190111T0015Z_XE4JGGZLQTNS334S.json.gz)
This query ran against the "default" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 56a188d5-9a10-4c30-a701-42c243c154c6.
The query is:
SELECT * FROM "default"."cloudtrail_table_logs" limit 10;
This is the S3 bucket Policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AWSCloudTrailAclCheck20150319",
"Effect": "Allow",
"Principal": {
"Service": "cloudtrail.amazonaws.com"
},
"Action": "s3:GetBucketAcl",
"Resource": "arn:aws:s3:::cf-account-foundation-cloudtrailstack-trailbucket-707522222211"
},
{
"Sid": "AWSCloudTrailWrite20150319",
"Effect": "Allow",
"Principal": {
"Service": "cloudtrail.amazonaws.com"
},
"Action": "s3:PutObject",
"Resource": "arn:aws:s3:::cf-account-foundation-cloudtrailstack-trailbucket-707522222211/AWSLogs/707522222211/*",
"Condition": {
"StringEquals": {
"s3:x-amz-acl": "bucket-owner-full-control"
}
}
}
]
}
The S3 bucket is in the region eu-central-1 (frankfurt) same as the athena table from which I make the queries.
I have administrator permissions on my IAM User.
I get the same error when I manually try to open a file in this bucket:
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>5D49DF767D01F32C</RequestId><HostId>9Vd/MvDy5/AJYExs6BXoZbuMxxjxTCfFqzaMTQDwyrgyVZpdL+AgDihiZu3k17PWEYOJ19I8sbQ=</HostId></Error>
I don't know what is going on here. One more detail: the bucket uses SSE-KMS encryption, but that alone shouldn't prevent queries against it.
I get the same error even when I make the bucket public.
Does anyone have a clue?

aws s3 command responds with 403 forbidden

I'm trying to install the AWS CodeDeploy agent on my EC2 instance:
aws s3 cp s3://aws-codedeploy-ap-southeast-2/latest/install . --region ap-southeast-2
fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
The IAM role for the instance has this policy document:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"s3:Get*",
"s3:List*"
],
"Effect": "Allow",
"Resource": "*"
}
]
}
and this trust relationship:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"Service": "codedeploy.ap-southeast-2.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
I followed the guideline at
http://docs.aws.amazon.com/codedeploy/latest/userguide/codedeploy-agent-operations-install-linux.html
I also attached the AdministratorGroup policy to my user.
The CodeDeploy agent is now running on my box.
That command requires working S3 permissions on the instance; since the install file is publicly accessible, you could download it with curl or wget instead:
curl -O https://aws-codedeploy-ap-southeast-2.s3.amazonaws.com/latest/install
or
wget https://aws-codedeploy-ap-southeast-2.s3.amazonaws.com/latest/install
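If you prefer to stay with the SDK, the same public install file can be fetched with an unsigned (anonymous) request; a boto3 sketch:
import boto3
from botocore import UNSIGNED
from botocore.config import Config
# Sketch: download the public CodeDeploy installer without credentials
# by sending an unsigned request.
s3 = boto3.client("s3", region_name="ap-southeast-2",
                  config=Config(signature_version=UNSIGNED))
s3.download_file("aws-codedeploy-ap-southeast-2", "latest/install", "install")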