Can I use an S3 endpoint for Amazon Athena? - amazon-web-services

I would like to know if it is possible to configure an endpoint to S3 (an S3 endpoint, not the VPC endpoint) for AWS Athena. I have looked everywhere in the documentation and could not find it. Is this even possible?
The idea is to use the endpoint to get to S3 for all the Athena queries.
Thanks and best regards
Krishna

The --endpoint-url option is normally used to override the endpoint the AWS CLI uses to access an AWS service.
I see it used when people use an S3-compatible service such as Wasabi, where they are pointing to a different service rather than the 'real' S3.
Amazon Athena knows how to connect directly to Amazon S3. It is not possible to override the S3 Endpoint when Athena connects to S3.
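To illustrate, here is a minimal sketch of the parameters an Athena StartQueryExecution request takes, written as a plain Python dict. The database name, query, and bucket are made-up placeholders, and the helper name is hypothetical. The only S3-related setting is the result location, given as an s3:// URI; as far as I know the request has no field for an S3 endpoint.

```python
def athena_query_params(sql, database, output_location):
    """Build keyword arguments for Athena's StartQueryExecution call."""
    # Athena addresses S3 by URI only; there is no endpoint to supply here.
    if not output_location.startswith("s3://"):
        raise ValueError("Athena result locations are s3:// URIs")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Placeholder values, for illustration only.
params = athena_query_params(
    "SELECT 1", "example_db", "s3://example-athena-results/output/"
)
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("athena").start_query_execution(**params)
```

Passing endpoint_url to the boto3 Athena client would change where the Athena API calls go, not how Athena itself reaches S3.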

Related

Does data in Amazon S3 go over the public internet when I use a Glue job?

I'm using AWS services to create a data pipeline.
I have data stored in an Amazon S3 bucket. I plan to use a Glue crawler to crawl the data under a prefix and extract the metadata, then a Glue job to run the ETL and save the data in another bucket.
My question is: over which network do these services communicate with each other? Is it possible that the data moves from Amazon S3 to Glue over the public internet?
Is there any AWS documentation that explains which networks AWS services use when they transfer data between them?
You need to grant explicit permission to any resource before it can access your S3 bucket.
IAM roles: create a role with a policy and attach that role to the AWS resource.
A bucket policy is another mechanism to grant access.
By default everything is private. Unless you grant access, the bucket is not accessible from the internet.
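As a rough sketch of the bucket-policy mechanism, the snippet below builds a policy document that lets one IAM role list a bucket and read objects under a prefix. The bucket name, prefix, account ID, and role name are hypothetical placeholders, and the helper function is only for illustration.

```python
import json


def read_only_bucket_policy(bucket, role_arn, prefix=""):
    """Return a bucket policy granting one IAM role read access."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowRoleToListBucket",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:ListBucket",
                # ListBucket applies to the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "AllowRoleToReadObjects",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:GetObject",
                # GetObject applies to the objects under the prefix.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


# Placeholder names, for illustration only.
policy = read_only_bucket_policy(
    "example-source-bucket",
    "arn:aws:iam::123456789012:role/example-glue-role",
    prefix="raw/",
)
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of document you would paste into the bucket's policy editor. Anything the policy does not allow stays private.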

AWS website with only S3 and RDS. Is it possible without EC2?

I have a task where I have to check whether it is possible to serve a secure website where the static content is served from S3 and the dynamic data is served from RDS.
Is it possible to do this job, or do I need EC2 instances as well?
Thanks for helping me,
Yes, this is possible. Store the static assets (HTML/JS/CSS/images) on S3, with a CloudFront distribution pointing to your S3 location. Put an API Gateway layer in front to act as the endpoints for your API calls. Those API endpoints invoke AWS Lambda functions, which contain the custom code that performs the actual RDS queries. Authentication is handled by Amazon Cognito.
All this can be done without ec2.
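A minimal sketch of the Lambda piece, assuming the API Gateway Lambda proxy integration. The fetch_rows function is a hypothetical stand-in for the real RDS query (a real function would use a database driver and credentials, which are omitted here), and the limit parameter is made up for the example.

```python
import json


def fetch_rows(limit):
    """Hypothetical stand-in for an RDS query; returns fake rows."""
    return [{"id": i, "name": f"item-{i}"} for i in range(1, limit + 1)]


def lambda_handler(event, context):
    """Handle an API Gateway proxy event and return a proxy response."""
    # queryStringParameters is None when the request has no query string.
    query = event.get("queryStringParameters") or {}
    try:
        limit = int(query.get("limit", "10"))
    except ValueError:
        return {
            "statusCode": 400,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"error": "limit must be an integer"}),
        }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        # The proxy integration expects the body as a string.
        "body": json.dumps(fetch_rows(limit)),
    }
```

The static pages served from S3 and CloudFront would call this endpoint from the browser, so no EC2 instance sits anywhere in the request path.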

AWS EMR Apache Spark and custom S3 endpoint in VPC

I use Apache Spark and Redshift in a VPC, and I also use AWS S3 for source data and for temp data for Redshift COPY.
I suspect that read/write performance to AWS S3 is not good enough, so based on the suggestion in the discussion at https://github.com/databricks/spark-redshift/issues/318 I created an S3 endpoint within the VPC. So far I can't see any performance difference when loading data from S3 before and after creating the endpoint.
In Apache Spark I read data in the following way:
spark.read.csv("s3://example-dev-data/dictionary/file.csv")
Do I need to add any extra logic or configuration on AWS EMR Apache Spark in order to make proper use of the AWS S3 endpoint?
The S3 VPC endpoint is a gateway endpoint, so you have to add an entry to the route table of the subnets where you start your EMR clusters that routes the S3 traffic to the endpoint.
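As a sketch of what that looks like through the API, the helper below builds the arguments for the EC2 CreateVpcEndpoint call. The VPC ID, region, and route table ID are hypothetical placeholders, and the helper name is made up. Associating a route table with a gateway endpoint is what adds the S3 route to that table.

```python
def s3_gateway_endpoint_params(vpc_id, region, route_table_ids):
    """Build keyword arguments for EC2's CreateVpcEndpoint call."""
    if not route_table_ids:
        # Without an associated route table, no subnet routes S3
        # traffic through the endpoint.
        raise ValueError("associate at least one route table")
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }


# Placeholder IDs, for illustration only.
endpoint_params = s3_gateway_endpoint_params(
    "vpc-0123456789abcdef0", "us-east-1", ["rtb-0123456789abcdef0"]
)
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("ec2", region_name="us-east-1").create_vpc_endpoint(**endpoint_params)
```

The route tables passed here should be the ones used by the subnets that host the EMR cluster nodes.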

AWS S3 access using vpc-endpoint

Suppose I create a vpc and a vpc-endpoint in region1.
Can I communicate to an s3-bucket-in-region2 using this vpc-endpoint, i.e. without using the internet?
No, VPC endpoints do not support cross-region requests. Your bucket(s) need to be in the same region as the VPC.
Endpoints for Amazon S3
Endpoints currently do not support cross-region requests—ensure that
you create your endpoint in the same region as your bucket. You can
find the location of your bucket by using the Amazon S3 console, or by
using the get-bucket-location command. Use a region-specific Amazon S3
endpoint to access your bucket; for example,
mybucket.s3-us-west-2.amazonaws.com. For more information about
region-specific endpoints for Amazon S3, see Amazon Simple Storage
Service (S3) in Amazon Web Services General Reference.
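A small sketch of the same-region check described above. The function names are made up for illustration. It assumes the bucket location comes from get-bucket-location, which reports no location constraint for buckets in us-east-1, and it uses the dot-separated form of the regional hostname rather than the older dash form shown in the quote.

```python
def normalize_bucket_region(location_constraint):
    """Map a get-bucket-location result to a region name."""
    # Buckets in us-east-1 report an empty (null) location constraint.
    return location_constraint or "us-east-1"


def endpoint_can_reach(endpoint_region, location_constraint):
    """A VPC endpoint only serves buckets in its own region."""
    return endpoint_region == normalize_bucket_region(location_constraint)


def regional_s3_host(bucket, region):
    """Region-specific virtual-hosted hostname for a bucket."""
    return f"{bucket}.s3.{region}.amazonaws.com"


print(endpoint_can_reach("us-west-2", "us-west-2"))  # True
print(endpoint_can_reach("us-west-2", "eu-west-1"))  # False
print(regional_s3_host("mybucket", "us-west-2"))
```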

Limit aws to just s3

Is there a way to disable access to all AWS services except S3? I have an account that will only use S3, and I am worried about unexpected charges from running EC2.
Alternatively, is there a way to create API keys for S3 access only?
You could create an IAM user with (perhaps) full permissions to S3 and read-only access to all other services. That way, even with API keys, the user can only work with S3 and can't create resources in any other service.
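A minimal sketch of an S3-only identity policy for such a user. It is a stricter variation on the suggestion above: it grants S3 and nothing else, relying on the fact that IAM denies any action that no policy allows. The helper name is made up, and you may want to narrow the actions or resources further.

```python
import json


def s3_only_policy():
    """Return an IAM policy document that allows S3 actions only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowS3Only",
                "Effect": "Allow",
                "Action": "s3:*",
                "Resource": "*",
            }
        ],
    }


# Attach the printed JSON to the IAM user (or a group the user is in).
print(json.dumps(s3_only_policy(), indent=2))
```

Access keys created for that user carry the same permissions, so they cannot launch EC2 instances or create resources in other services.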