Suppose I create a VPC and a VPC endpoint in region1.
Can I communicate with an S3 bucket in region2 through this VPC endpoint, i.e. without using the internet?
No, VPC endpoints do not support cross-region requests. Your bucket(s) need to be in the same region as the VPC.
Endpoints for Amazon S3
Endpoints currently do not support cross-region requests—ensure that
you create your endpoint in the same region as your bucket. You can
find the location of your bucket by using the Amazon S3 console, or by
using the get-bucket-location command. Use a region-specific Amazon S3
endpoint to access your bucket; for example,
mybucket.s3-us-west-2.amazonaws.com. For more information about
region-specific endpoints for Amazon S3, see Amazon Simple Storage
Service (S3) in Amazon Web Services General Reference.
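For illustration, here is a minimal boto3 sketch (the bucket name, VPC ID, and route table ID are hypothetical placeholders) that looks up the bucket's region and creates the gateway endpoint in that same region:

import boto3

BUCKET = "mybucket"                        # hypothetical bucket name
VPC_ID = "vpc-0123456789abcdef0"           # hypothetical VPC in the bucket's region
ROUTE_TABLE_ID = "rtb-0123456789abcdef0"   # hypothetical route table of your subnets

# Find the bucket's region (equivalent to the get-bucket-location command above).
s3 = boto3.client("s3")
region = s3.get_bucket_location(Bucket=BUCKET)["LocationConstraint"] or "us-east-1"

# Create the S3 gateway endpoint in the same region as the bucket.
ec2 = boto3.client("ec2", region_name=region)
endpoint = ec2.create_vpc_endpoint(
    VpcId=VPC_ID,
    ServiceName=f"com.amazonaws.{region}.s3",
    RouteTableIds=[ROUTE_TABLE_ID],
    VpcEndpointType="Gateway",
)
print(endpoint["VpcEndpoint"]["VpcEndpointId"])

The EC2 client is created in the bucket's region, which assumes the VPC lives in that region as well.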
For example:
Consider a scenario where I have a back-end service which takes dynamic data from RDS and static data (audio/video/PDF) from an S3 bucket.
The back-end service is deployed on an EC2 instance, which internally uses the AWS SDK to fetch the static data from the S3 bucket. Below is the flow:
User Request Data ---> AWS Route 53 ---> ALB ---> Target EC2 Instance ---> Fetch Data from S3 Bucket.
Based on the above scenario, if a user request always routes to the EC2 instance, and the EC2 instance and S3 are in the same region, is there any need to configure CloudFront in the flow?
Yes, I strongly recommend using CloudFront with S3 for your static data.
In fact this is one of the primary use cases. It will give you an advantage not only in terms of latency and cost but also in terms of security, because you can control who can access content in your S3 bucket using an OAI (origin access identity).
If you want to know more about how CloudFront can help here, there is a dedicated blog post from AWS on this use case: https://aws.amazon.com/blogs/networking-and-content-delivery/amazon-s3-amazon-cloudfront-a-match-made-in-the-cloud/
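As a rough sketch of the OAI restriction mentioned above (the bucket name and OAI ID are hypothetical placeholders), the bucket policy that allows only the CloudFront origin access identity to read objects can be applied with boto3 like this:

import json
import boto3

BUCKET = "my-static-assets"   # hypothetical bucket holding the static data
OAI_ID = "E2EXAMPLEOAIID"     # hypothetical CloudFront origin access identity ID

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowCloudFrontOAIReadOnly",
        "Effect": "Allow",
        "Principal": {
            "AWS": f"arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity {OAI_ID}"
        },
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{BUCKET}/*",
    }],
}

# Attach the policy so only the OAI (i.e. CloudFront) can read objects.
boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))

With public access blocked on the bucket, users can then only reach the content through CloudFront.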
I want to access data residing in an AWS S3 bucket from a GCP Cloud Composer environment's service account.
I followed this link, but it also relies on creating keys.
Is there a way to connect to AWS S3 from GCP via roles only?
I have a Lambda function accessing an S3 bucket using the aws-sdk.
There is a high number of operations (requests) to the S3 bucket, which is considerably increasing the cost of using Lambda.
I was hoping that the requests would use the s3:// protocol, but they are going over the internet.
I understand that one solution could be:
Attach the Lambda to a VPC
Create a VPC endpoint to S3
Update the route tables of the VPC
Is there a simpler way to do so?
An alternative could be creating an API Gateway with a Lambda proxy method integration, following the AWS Guide or Tutorial.
You can then configure your API Gateway to act as your external-facing integration over the internet, while your Lambda and S3 traffic stays within AWS.
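A minimal sketch of such a proxy handler (the bucket name and path parameter are hypothetical) that fetches an object from S3 and returns it in the response shape the Lambda proxy integration expects:

import base64
import boto3

s3 = boto3.client("s3")
BUCKET = "my-static-assets"   # hypothetical bucket name

def handler(event, context):
    # With Lambda proxy integration, path parameters arrive in the event.
    key = (event.get("pathParameters") or {}).get("key", "index.html")
    obj = s3.get_object(Bucket=BUCKET, Key=key)

    # Return the response shape API Gateway's proxy integration expects.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": obj.get("ContentType", "application/octet-stream")},
        "isBase64Encoded": True,
        "body": base64.b64encode(obj["Body"].read()).decode("utf-8"),
    }

Returning binary content this way also assumes binary media types are enabled on the API.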
The traffic won't go over the internet or incur additional data transfer cost as long as the non-VPC Lambda function is executing in the same region as the S3 bucket, so a VPC is not needed in this case.
https://aws.amazon.com/s3/pricing/
You pay for all bandwidth into and out of Amazon S3, except for the following:
• Data transferred in from the internet.
• Data transferred out to an Amazon Elastic Compute Cloud (Amazon EC2) instance, when the instance is in the same AWS Region as the S3 bucket.
• Data transferred out to Amazon CloudFront (CloudFront).
You can think of Lambda like EC2 here, so the data transfer is free, but be careful: you still need to pay for the API requests.
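If you want to verify that the free-transfer condition holds, here is a quick sketch (the bucket name is a placeholder) that compares the Lambda's region with the bucket's region:

import os
import boto3

BUCKET = "my-data-bucket"   # hypothetical bucket name

# AWS_REGION is set automatically in the Lambda runtime environment.
lambda_region = os.environ.get("AWS_REGION")
bucket_region = (
    boto3.client("s3").get_bucket_location(Bucket=BUCKET)["LocationConstraint"]
    or "us-east-1"
)

# Same region => no data transfer charge for the GETs, only the per-request charge.
print("same region:", lambda_region == bucket_region)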
I use Apache Spark and Redshift in a VPC and also use AWS S3 for source data and for the temp data needed by Redshift COPY.
Right now I suspect that the read/write performance from/to AWS S3 is not good enough, so based on the suggestion in the following discussion https://github.com/databricks/spark-redshift/issues/318 I created an S3 endpoint within the VPC. However, I can't see any performance difference before and after the S3 endpoint creation when loading data from S3.
In Apache Spark I read data in the following way:
spark.read.csv("s3://example-dev-data/dictionary/file.csv")
Do I need to add/configure some extra logic/configuration on AWS EMR Apache Spark in order to properly use the AWS S3 endpoint?
The S3 VPC endpoint is a gateway endpoint, so you have to add an entry to the route tables of the subnets where you launch your EMR clusters that routes S3 traffic to the endpoint.
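For example, a small boto3 sketch (the endpoint and route table IDs are hypothetical placeholders) that associates the route tables of the EMR subnets with an existing S3 gateway endpoint:

import boto3

ENDPOINT_ID = "vpce-0123456789abcdef0"            # hypothetical S3 gateway endpoint
EMR_ROUTE_TABLE_IDS = ["rtb-0123456789abcdef0"]   # hypothetical route tables of the EMR subnets

# Associating the route tables makes S3-bound traffic from those subnets use the endpoint.
boto3.client("ec2").modify_vpc_endpoint(
    VpcEndpointId=ENDPOINT_ID,
    AddRouteTableIds=EMR_ROUTE_TABLE_IDS,
)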
Can I connect to another account's AWS services (S3, DynamoDB) from an EC2 instance in my account using a VPC endpoint?
Amazon S3 and Amazon DynamoDB are accessed on the Internet via API calls.
When a call is made to these services, a set of credentials is provided to identify the account and user.
If you wish to access S3 or DynamoDB resources belonging to a different account, you simply need to use credentials that belong to the target account. The actual request can be made from anywhere on the Internet (e.g. from Amazon EC2 or from a computer under your desk); the only thing that matters is that you have valid credentials linked to the desired AWS account.
There is no need to manipulate VPC configurations to access resources belonging to a different AWS Account. The source of the request is actually irrelevant.
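One common way to obtain such credentials is to assume an IAM role in the target account; here is a minimal sketch (the role ARN and bucket name are hypothetical placeholders):

import boto3

# Hypothetical role in the other account that trusts your account.
TARGET_ROLE_ARN = "arn:aws:iam::111122223333:role/CrossAccountS3Access"
TARGET_BUCKET = "other-account-bucket"   # hypothetical bucket in the other account

# Exchange your own credentials for temporary credentials in the target account.
creds = boto3.client("sts").assume_role(
    RoleArn=TARGET_ROLE_ARN,
    RoleSessionName="cross-account-demo",
)["Credentials"]

# Clients built from these credentials act as the target account's role.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(s3.list_objects_v2(Bucket=TARGET_BUCKET).get("KeyCount"))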