AWS Batch job getting Access Denied on S3 despite user role - amazon-web-services

I am deploying my first batch job on AWS. When I run my docker image in an EC2 instance, the script called by the job runs fine. I have assigned an IAM role to this instance to allow S3 access.
But when I run the same script as a job on AWS Batch, it fails due to Access Denied errors on S3 access. This is despite the fact that in the Job Definition, I assign an IAM role (created for Elastic Container Service Task) that has full S3 access.
If I launch my batch job with a command that does not access S3, it runs fine.
Since using an IAM role for the job definition seems not to be sufficient, how then do I grant S3 permissions within a Batch Job on AWS?
EDIT
So if I just run aws s3 ls interlinked as my job, that also runs properly. What does not work is running the R script:
library(aws.s3)
get_bucket("mybucket")[[1]]
Which fails with Access Denied.
So it seems the issue is either with the aws.s3 package or, more likely, my use of it.

The problem turned out to be that I had IAM Roles specified for both my compute environment (more restrictive) and my jobs (less restrictive).
In this scenario (where role based credentials are desired), the aws.s3 R package uses aws.signature and aws.ec2metadata to pull temporary credentials from the role. It pulls the compute environment role (which is an ec2 role), but not the job role.
My solution was just to grant the required S3 permissions to my compute environment's role.

Related

AWS Boto3/Botocore assume IAM role in ECS task

I'm trying to create a botocore session (that does not use my local AWS credentials on ~/.aws/credentials). In other words, I want to create a "burner AWS account". With that burner credentials/session, I want to setup an STS client and with that client, assume a role in order to access a DynamoDB database. Can someone provide some example code which accomplishes exactly this?
Because if I want my system to go into production environment, I CANNOT store the AWS credentials on Github because AWS will scan for it. I'm trying to implement a workaround such that we don't have to store ~/.aws/credentials file on Github.
The running a task in Amazon ECS, simply assign an IAM Role to the task.
Amazon ECS will then generate temporary credentials for that IAM Role. Any code that uses an AWS SDK (such as boto3 for Python) knows how to access those credentials via the metadata service.
The result is that your code using boto3 will automatically receive credentials that have the permissions associated with the IAM Role assigned to the task.
See: IAM roles for tasks - Amazon Elastic Container Service

How to fix expired token in AWS s3 copy command?

I need to run the command aws s3 cp <filename> <bucketname> from an EC2 RHEL instance to copy a file from the instance to an S3 bucket.
When I run this command, I receive this error: An error occurred (ExpiredToken) when calling the PutObject operation: The provided token has expired
I also found that this same error occurs when trying to run many other CLI commands from the instance.
I do not want to change my IAM role because the command was previously working perfectly fine and IAM policy changes must go through an approval process. I have double checked the IAM role the instance is assuming and it still contains the correct configuration for allowing PutObject on the correct resources.
What can I do to allow AWS CLI commands to work again in my instance?
AWS API tokens are time-sensitive, and VMs in the cloud tend to suffer from clock drift.
Check that time is accurate on the RHEL instance, and use ntp servers to make sure any drift is regularly corrected.

boto3 can't connect to S3 from Docker container running in AWS batch

I am attempting to launch a Docker container stored in ECR as an AWS batch job. The entrypoint python script of this container attempts to connect to S3 and download a file.
I have attached a role with AmazonS3FullAccess to both the AWSBatchServiceRole in the compute environment and I have also attached a role with AmazonS3FullAccess to the compute resources.
This is the following error that is being logged: botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "https://s3.amazonaws.com/"
There is a chance that these instances are being launched in a custom VPC, not the default VPC. I'm not sure this makes a difference, but maybe that is part of the problem. I do not have appropriate access to check. I have tested this Docker image on an EC2 instance launched in the same VPC and everything works as expected.
You mentioned compute environment and compute resources. Did you add this S3 policy to the Job Role as mentioned here?
After you have created a role and attached a policy to that role, you can run tasks that assume the role. You have several options to do this:
Specify an IAM role for your tasks in the task definition. You can create a new task definition or a new revision of an existing task definition and specify the role you created previously. If you use the console to create your task definition, choose your IAM role in the Task Role field. If you use the AWS CLI or SDKs, specify your task role ARN using the taskRoleArn parameter. For more information, see Creating a Task Definition.
Specify an IAM task role override when running a task. You can specify an IAM task role override when running a task. If you use the console to run your task, choose Advanced Options and then choose your IAM role in the Task Role field. If you use the AWS CLI or SDKs, specify your task role ARN using the taskRoleArn parameter in the overrides JSON object. For more information, see Running Tasks.

AWS EC2: Do I need to run "aws configure" to assign access keys to if my instance already has an IAM role?

I'm fairly new to AWS. I'm setting up an EC2 instance (an Ubuntu 18.04 LAMP server).
I've installed the aws CLI on the instance, so I can automate EBS snapshots for backup.
I've also created an IAM role with the needed permissions to run aws ec2 create-snapshot, and I've assigned this role to my EC2 instance.
My question: is there any need to run aws configure on the EC2 instance, in order to set the AWS Access Key ID and AWS Secret Access Key? I'm still wrapping my head around AWS IAM roles – but (since the EC2 instance has a role), it sounds like the instance will acquire the needed keys from IAM automagically. Therefore, I assume that there's never any need to run aws configure. (In fact, it seems like this would be counterproductive, since the keys set via aws configure would override the keys acquired automatically via the role.)
Is all of that accurate?
No, the AWS CLI will progress through a list of credential providers. The instance metadata service will eventually be reached, even if you have not configured the AWS cli:
https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#config-settings-and-precedence
And yes, if you add keys to the AWSCLI config file, they will be used with higher priority than those obtained from the instance metadata service.

Using IAM roles transitively

I have a question on using IAM roles with EC2 and EMR. Here's my current setup:
I have a EC2 machine launched with a particular IAM role (let's call this role 'admin'). My workflow is to upload a file to S3 from this machine and then create an EMR cluster with a particular IAM role (a 'runner' role). The EMR cluster works on the file uploaded to S3 from the admin machine.
Admin is a role with privileges to all APIs in all AWS services. Runner has access to all APIs in EMR, EC2 and S3.
For some reason, the EMR cluster is unable to access the input file loaded in S3. It keeps getting an 'access denied' exception from s3.
I guess writing to s3 from one IAM role and reading it from a different IAM role is what is causing the issue.
Any ideas on what is going wrong here or whether this is even a supported use-case is appreciated.
Thanks!
http://blogs.aws.amazon.com/security/post/TxPOJBY6FE360K/IAM-policies-and-Bucket-Policies-and-ACLs-Oh-My-Controlling-Access-to-S3-Resourc
S3 objects are protected in three ways as seen in the post I linked to.
Your IAM role will need the permission to read S3 objects.
The S3 bucket policy must allow your IAM role access to the object.
The S3 ACL for the specific object must also allow your IAM role access to the object.