Access Denied error when running Athena query from root account - amazon-web-services

I am getting an Access Denied error when I try to run an Athena query from the root account. What am I doing wrong?
I have tried to create IAM user roles, but I am not sure if I am doing it right. I just wanted to do a quick test:
Create S3 bucket -> upload CSV -> go to Athena -> pull data from S3 -> run query
The error that I am getting is:
Your query has the following error(s):
Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: BF8CDA860116C79B; S3 Extended Request ID: 825bTOZNiWP1bUJUGV3Bg5NSzy3ywqZdoBtwYItrxQfr8kqDpGP1RBIHR6NFIBySgO/qIKA8/Cw=)
This query ran against the "sampledb" database, unless qualified by the query. Please post the error message on our forum or contact customer support with
Query Id: c08c11e6-e049-46f1-a671-0746da7e7c84.

If you are executing the query from the AWS Athena web console, ensure that you have access to the S3 location of the table. You can extract the location from the output of the SHOW CREATE TABLE command.
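For example, you could first pull the table's LOCATION and then confirm that the identity you are querying with can actually list that prefix. A minimal boto3 sketch, assuming placeholder names for the table, database, results bucket, and data prefix (none of these come from the question):

import boto3

athena = boto3.client("athena", region_name="us-east-1")
s3 = boto3.client("s3", region_name="us-east-1")

# Ask Athena for the table DDL; the LOCATION clause in the result shows where the data lives.
resp = athena.start_query_execution(
    QueryString="SHOW CREATE TABLE my_table",          # placeholder table name
    QueryExecutionContext={"Database": "sampledb"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results-bucket/"},  # placeholder bucket
)
print("Query execution id:", resp["QueryExecutionId"])

# Separately, verify the same identity can list the table's S3 location.
listing = s3.list_objects_v2(Bucket="my-data-bucket", Prefix="path/to/csv/", MaxKeys=5)
for obj in listing.get("Contents", []):
    print(obj["Key"])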

Related

AWS Glue Spark job failing on DataFrame persist()

I have an AWS Glue Spark job that fails with the following error:
An error occurred while calling o362.cache. com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: ...; S3 Extended Request ID: ...; Proxy: null), S3 Extended Request ID: ...
I believe the error is thrown at the line where the Spark persist() method is called on a DataFrame. The Glue job is assigned an IAM role that has full S3 access (all locations and operations allowed), yet I'm still getting the S3 exception. I tried setting the "Temporary path" for the Glue job in the AWS console to a specific S3 bucket with full access. I also tried setting the Spark temporary directory to a specific S3 bucket with full access via:
import pyspark
from pyspark import SparkContext

# Attempt to point Spark's temporary directory at an S3 bucket
conf = pyspark.SparkConf()
conf.set('spark.local.dir', 's3://...')
self.sc = SparkContext(conf=conf)
which didn't help. It's very strange that the job is failing even with full S3 access. I'm not sure what to try next; any help would be really appreciated. Thank you!

How to write logs and data to Druid Deep Storage in AWS S3

We have a Druid cluster set up, and now I am trying to write the indexing logs and data into S3 deep storage.
Following are the details:
druid.storage.type=s3
druid.storage.bucket=bucket-name
druid.storage.baseKey=druid/segments
# For S3:
druid.indexer.logs.type=s3
druid.indexer.logs.s3Bucket=your-bucket
druid.indexer.logs.s3Prefix=druid/indexing-logs
After running an ingestion task, I am getting the below error:
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: HCAFAZBA85QW14Q0; S3 Extended Request ID: 2ICzpVAyFcy/PLrnsUWZBJwEo7dFl/S2lwDTMn+v83uTp71jlEe59Q4/vFhwJU5/WGMYramdSIs=; Proxy: null)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1862) ~[aws-java-sdk-core-1.12.37.jar:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1415) ~[aws-java-sdk-core-1.12.37.jar:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1384) ~[aws-java-sdk-core-1.12.37.jar:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154) ~[aws-java-sdk-core-1.12.37.jar:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811) ~[aws-java-sdk-core-1.12.37.jar:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779) ~[aws-java-sdk-core-1.12.37.jar:?]
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753) ~[aws-java-sdk-core-1.12.37.jar:?]
I tried adding the IAM instance role at the bucket level, and the same role is attached to the EC2 instance where the Druid services are running.
Can someone please guide me on the steps I am missing here?
I got it done!
I created a new IAM role and a policy where I granted permission to the S3 bucket and its subfolder.
NOTE: Permission on the S3 bucket itself is a must.
Example: If the bucket name is "Bucket_1" and the subfolder where deep storage is configured is "deep_storage", then make sure to grant permission like:
"arn:aws:s3:::Bucket_1"
"arn:aws:s3:::Bucket_1/*"
My mistake was not granting bucket-level permission and instead trying to grant permission directly at the subfolder level.
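As a rough sketch of what creating such a policy could look like with boto3 (the policy name and the exact set of S3 actions below are assumptions on my part, not something prescribed here; the two ARNs are the ones shown above):

import json
import boto3

iam = boto3.client("iam")

# Allow access to the bucket itself AND to the objects under it.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
            "Resource": "arn:aws:s3:::Bucket_1",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::Bucket_1/*",
        },
    ],
}

iam.create_policy(
    PolicyName="druid-deep-storage-s3",  # placeholder name
    PolicyDocument=json.dumps(policy_document),
)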
Also remove or comment out the below parameters from the common.runtime.properties file on each server of your Druid cluster:
druid.s3.accessKey=
druid.s3.secretKey=
After this configuration I can see the data being written successfully to S3 deep storage using the IAM role, and not the secret & access key.

Jenkins AWS Steps authentication error accessing s3

Background
I am attempting to upload a file to an AWS S3 bucket in Jenkins. I am using the steps/closures provided by the AWS Steps plugin. I am using an Access Key ID and an Access Key Secret and storing them as a username and password, respectively, in Credential Manager.
Code
Below is the code I am using in a declarative pipeline script
sh('echo "test" > someFile')
withAWS(credentials:'AwsS3', region:'us-east-1') {
    s3Upload(file:'someFile', bucket:'ec-sis-integration-test', acl:'BucketOwnerFullControl')
}
sh('rm -f someFile')
Here is a screenshot of the credentials as they are stored globally in Credential Manager.
Issue
Whenever I run the pipeline, I get the following error:
com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 5N9VEJBY5MDZ2K0W; S3 Extended Request ID: AJmuP635cME8m035nA6rQVltCCJqHDPXsjVk+sLziTyuAiSN23Q1j5RtoQwfHCDXAOexPVVecA4=; Proxy: null), S3 Extended Request ID: AJmuP635cME8m035nA6rQVltCCJqHDPXsjVk+sLziTyuAiSN23Q1j5RtoQwfHCDXAOexPVVecA4=
Does anyone know why this isn't working?
Troubleshooting
I have verified that the Access Key ID and Access Key Secret combination works by testing it through a small Java application I wrote. Additionally, I set the ID/secret via Java system properties (through the script console), but still get the same error.
System.setProperty("aws.accessKeyId", "<KEY_ID>")
System.setProperty("aws.secretKey", "<KEY_SECRET>")
I also tried changing the credential type from username/password to AWS credentials, as seen below. It made no difference.
It might be a bucket and object ownership issue. Check whether the credentials you are using actually allow you to upload to the bucket ec-sis-integration-test.
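One way to narrow that down is to exercise the same key pair outside Jenkins and check both the resolved identity and the upload itself. A small boto3 sketch, assuming the bucket name from the question and placeholder key values:

import boto3

session = boto3.Session(
    aws_access_key_id="<KEY_ID>",          # placeholder
    aws_secret_access_key="<KEY_SECRET>",  # placeholder
    region_name="us-east-1",
)

# Which principal do these keys actually resolve to?
print(session.client("sts").get_caller_identity()["Arn"])

# Can that principal write to the bucket with the same ACL the pipeline uses?
session.client("s3").put_object(
    Bucket="ec-sis-integration-test",
    Key="someFile",
    Body=b"test\n",
    ACL="bucket-owner-full-control",
)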

Presto cannot access AWS S3 data

I'm trying to connect my local Presto to AWS Glue for metadata and S3 for data. I'm able to connect to Glue and do show tables; and desc <table>;. However, it's giving me this error when I do select * from <table>;
Query <query id> failed: The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId; Request ID: <id>; S3 Extended Request ID: <id>)
My hive.properties file looks like this:
connector.name=hive-hadoop2
hive.metastore=glue
hive.metastore.glue.region=<region>
hive.s3.use-instance-credentials=false
hive.s3.aws-access-key=<access key>
hive.s3.aws-secret-key=<secret key>
The error says the credentials are not recognized as valid. Since you can connect to Glue, it seems your environment or ~/.aws has some valid credentials. You should be able to use those credentials for S3 access as well.
For this, make sure you are using Presto 332 or newer, and remove hive.s3.use-instance-credentials, hive.s3.aws-access-key, and hive.s3.aws-secret-key from your settings.
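In other words, the hive.properties could end up looking roughly like this, with S3 credentials left to the default AWS credential chain (exact contents depend on your setup):

connector.name=hive-hadoop2
hive.metastore=glue
hive.metastore.glue.region=<region>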

Spark Redshift: error while reading redshift tables using spark

I am getting the below error while reading data from a Redshift table using Spark.
Below is the code:
Dataset<Row> dfread = sql.read()
    .format("com.databricks.spark.redshift")
    .option("url", url)
    //.option("query", "select * from TESTSPARK")
    .option("dbtable", "TESTSPARK")
    .option("forward_spark_s3_credentials", true)
    .option("tempdir", "s3n://test/Redshift/temp/")
    .option("sse", true)
    .option("region", "us-east-1")
    .load();
error:
Exception in thread "main" java.sql.SQLException: [Amazon](500310) Invalid operation: Unable to upload manifest file - S3ServiceException:Access Denied,Status 403,Error AccessDenied,Rid=,CanRetry 1
Details:
error: Unable to upload manifest file - S3ServiceException:Access Denied,Status 403,Error AccessDenied,Rid 6FC2B3FD56DA0EAC,ExtRid I,CanRetry 1
code: 9012
context: s3://jd-us01-cis-machine-telematics-devl-data-processed/Redshift/temp/f06bc4b2-494d-49b0-a100-2246818e22cf/manifest
query: 44179
Can anyone please help?
You're getting a permission error from S3 when Redshift tries to access the files you're telling it to load.
Have you configured the access keys for S3 access before calling the load()?
sc.hadoopConfiguration.set("fs.s3.awsAccessKeyId", "ASDFGHJKLQWERTYUIOP")
sc.hadoopConfiguration.set("fs.s3.awsSecretAccessKey", "QaZWSxEDC/rfgyuTGBYHY&UKEFGBTHNMYJ")
You should be able to check which access key id was used from the Redshift side by querying the stl_query table.
From the error "S3ServiceException:Access Denied", it seems the permissions are not set for Redshift to access the S3 files. Please follow the steps below:
1. Add a bucket policy to that bucket that allows the Redshift account access.
2. Create an IAM role in the Redshift account that Redshift can assume.
3. Grant permissions to access the S3 bucket to the newly created role.
4. Associate the role with the Redshift cluster.
5. Run COPY statements.
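If you take the IAM-role route, the spark-redshift connector can pass the role to Redshift instead of forwarding keys. A hedged PySpark sketch, assuming a SparkSession named spark, the same url and tempdir as in the question, and a placeholder role ARN:

# Read via the Redshift connector, letting Redshift assume an IAM role for S3 access
# instead of forwarding access keys.
df = spark.read \
    .format("com.databricks.spark.redshift") \
    .option("url", url) \
    .option("dbtable", "TESTSPARK") \
    .option("aws_iam_role", "arn:aws:iam::<account-id>:role/<redshift-s3-role>") \
    .option("tempdir", "s3n://test/Redshift/temp/") \
    .load()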