We have a hybrid model with on-premise connected to AWS via a site-to-site VPN. We need to download data from S3 to on-premise in such a way that the traffic goes from on-premise to AWS and back without crossing the open Internet, for security reasons. I.e. similar to this:
on-prem --VPN--> AWS private subnet --> S3 endpoint --> S3
This scheme works with interface endpoints, since they generate private DNS names that can be called from on-premise, but the S3 endpoint is a gateway endpoint, not an interface endpoint, so it doesn't generate private DNS names.
How can this be achieved?
In February 2021, AWS released S3 PrivateLink Interface Endpoints, which are different from the S3 Gateway Endpoints.
The difference is that S3 Interface Endpoints resolve to private VPC IP addresses and are routable from outside the VPC (e.g. via VPN, Direct Connect, Transit Gateway etc). S3 Gateway Endpoints use public IP ranges and are only routable from resources within the VPC.
With Interface Endpoints you can route to S3 buckets from your on-premise network via your VPN and one or more subnets, without needing a proxy in the VPC and without traversing the public Internet.
Refer to the blog announcement and the S3 privatelink user guide for more details.
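As a rough sketch of what this looks like from an on-premise client: requests are sent to the endpoint-specific DNS name of the Interface Endpoint, prefixed with "bucket." for bucket operations. The helper name and the endpoint ID below are made up for illustration, so substitute the DNS name shown for your own endpoint.

```python
def s3_bucket_endpoint_url(vpce_dns_name: str) -> str:
    """Build the endpoint URL for bucket operations from an S3 Interface
    Endpoint DNS name (hypothetical helper, not part of any AWS SDK)."""
    host = vpce_dns_name.strip()
    # Endpoint DNS names are often displayed with a leading wildcard label.
    if host.startswith("*."):
        host = host[2:]
    if not host.startswith("vpce-") or ".vpce." not in host:
        raise ValueError(f"not an endpoint-specific DNS name: {vpce_dns_name!r}")
    return f"https://bucket.{host}"


if __name__ == "__main__":
    # Placeholder endpoint ID; use the one from your own VPC.
    print(s3_bucket_endpoint_url(
        "*.vpce-1a2b3c4d-5e6f.s3.us-east-1.vpce.amazonaws.com"))
```

The resulting URL can then be passed as the endpoint, for example via --endpoint-url in the AWS CLI or endpoint_url when creating an S3 client in boto3. The on-premise host still needs to be able to resolve that name and route to the endpoint's private IPs over the VPN.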
According to the VPC endpoints documentation, S3 doesn't provide direct access through a VPN:
Endpoint connections cannot be extended out of a VPC. Resources on the other side of a VPN connection, a VPC peering connection, an AWS Direct Connect connection, or a ClassicLink connection in your VPC cannot use the endpoint to communicate with resources in the endpoint service.
However, you could route the Amazon S3 IP address ranges through your VPN connection to the VPC and explicitly allow access to the S3 buckets for your VPN's public IP addresses in the bucket policy and deny everything else.
Please note that the Amazon S3 IP address ranges are subject to change.
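A minimal sketch of the "deny everything else" part of such a bucket policy, built as a Python dict. The bucket name and the IP address are placeholders, and the function name is invented for this example. Note that this statement only denies; access for the allowed callers still has to be granted separately (for instance through IAM).

```python
import ipaddress
import json


def vpn_only_bucket_policy(bucket: str, allowed_cidrs: list) -> dict:
    """Return a policy that denies all S3 actions unless the request
    comes from one of the given public source IP ranges."""
    # Fail early on malformed CIDRs instead of uploading a broken policy.
    for cidr in allowed_cidrs:
        ipaddress.ip_network(cidr)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessFromVpnPublicIps",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "NotIpAddress": {"aws:SourceIp": list(allowed_cidrs)}
                },
            }
        ],
    }


if __name__ == "__main__":
    # 203.0.113.10 is a documentation address standing in for the VPN's public IP.
    policy = vpn_only_bucket_policy("example-bucket", ["203.0.113.10/32"])
    print(json.dumps(policy, indent=2))
```

Be careful when applying a policy like this: a wrong IP range denies access to everyone, including the administrators editing the policy.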
I have a VPC with a couple of subnets containing EC2 instances.
The EC2 instances run code that invokes various AWS services like DynamoDB.
Is the connection from EC2 to an AWS service (like DynamoDB) happening within the AWS network, or via the public internet?
Is there any way to control this?
Is the connection from EC2 to an AWS service (like DynamoDB) happening within the AWS network, or via the public internet?
Technically the process on EC2 would be hitting the AWS DynamoDB public API which is on the Internet. The traffic would be routed through the Internet Gateway you have attached to the VPC. I think if it is all in the same region it may not actually leave the AWS data center, and you could try testing that via tools like traceroute, but I don't think there are any guarantees of that.
Is there any way to control this?
Yes, add a VPC Endpoint to your VPC for the service you want to connect to. Traffic to that service will then go over the VPC Endpoint instead of your VPC's Internet Gateway: Interface Endpoints do this through private DNS in the VPC, while Gateway Endpoints (S3 and DynamoDB) do it through a route table entry. The traffic will then be guaranteed to stay within the AWS network.
I have a static website set up in S3 with a bucket policy that denies access to the website (a simple index.html) unless it comes from a VPC Endpoint. I configured the VPC Endpoint as com.amazonaws.us-east-1.s3, service type: Gateway. If I add 0.0.0.0/0 to my AWS Client VPN route table, I am able to access the website only when connected to the VPN, as expected, but I want to avoid using the VPN for general website traffic, which essentially means removing 0.0.0.0/0. I think I can do this with split traffic enabled on the VPN, but I don't want to keep 0.0.0.0/0 in the VPN route table if I don't need to.
So in short, is there an IP address for the VPC endpoint, or which IP could I use to explicitly direct traffic to the private website?
Sadly, you can't do this. S3 buckets in website mode are only available through the internet. You can't make an S3 website endpoint private and accessible from within a VPC. The connections must come from the internet.
If you really want a private website, you have to host it yourself, on a tiny instance or an ECS container. Then you will be able to access it from within the VPC only.
I am using a few AWS Lambda functions, which sit inside private subnets.
These private subnets have VPC endpoints configured for the services the functions need access to.
The current setup does not use a NAT gateway, so all traffic from the functions goes through the VPC endpoints.
I now have a use case where we need to use a NAT gateway.
But would enabling NAT mean that the functions no longer use the VPC endpoints for external service access, and use the NAT instead?
I think this works as follows. For:
Gateway endpoints (S3, DynamoDB)
Routes to them are added automatically to your route tables when you create them. The docs say:
If you have an existing route in your route table for all internet traffic (0.0.0.0/0) that points to an internet gateway, the endpoint route takes precedence for all traffic destined for the service, because the IP address range for the service is more specific than 0.0.0.0/0. All other internet traffic goes to your internet gateway, including traffic that's destined for the service in other Regions.
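The "more specific range wins" rule from the quote is ordinary longest-prefix matching. Here is a small sketch of it with a made-up route table; real VPC route tables reference the service by a prefix list ID rather than a literal CIDR, and 198.51.100.0/24 is only a documentation range standing in for the service's addresses.

```python
import ipaddress


def pick_route(route_table: dict, destination_ip: str):
    """Return the target of the most specific route that matches the
    destination, or None if no route matches."""
    dest = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if dest in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The longest prefix (largest prefix length) takes precedence.
    return max(matches)[1]


# Hypothetical route table for a private subnet with both a NAT gateway
# and an S3 gateway endpoint.
ROUTES = {
    "10.0.0.0/16": "local",
    "0.0.0.0/0": "nat-gateway",
    "198.51.100.0/24": "s3-gateway-endpoint",  # stand-in for the S3 prefix list
}

if __name__ == "__main__":
    for ip in ("198.51.100.25", "8.8.8.8", "10.0.1.5"):
        print(ip, "->", pick_route(ROUTES, ip))
```

In this model, adding the 0.0.0.0/0 route to a NAT gateway does not change where traffic for the service range goes, which is the behaviour the quoted documentation describes for an internet gateway route.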
Interface VPC Endpoints
They work by changing the IP addresses that the service's DNS name resolves to. The name then resolves to the private addresses of the endpoint network interfaces. The docs say:
The hosted zone contains a record set for the default DNS name for the service (for example, ec2.us-east-1.amazonaws.com) that resolves to the private IP addresses of the endpoint network interfaces in your VPC. This enables you to make requests to the service using its default DNS hostname instead of the endpoint-specific DNS hostnames.
To use private DNS, you must set the following VPC attributes to true: enableDnsHostnames and enableDnsSupport.
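One way to check this from inside the VPC is to resolve the service's default hostname and see whether every returned address is private. This is only a sketch: the function names are invented here, and the hostname in the usage part is the example from the quote above.

```python
import ipaddress
import socket


def resolve(hostname: str) -> list:
    """Return the distinct IP addresses the hostname resolves to."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def all_private(addresses: list) -> bool:
    """True if there is at least one address and all of them are private,
    which is what you would expect when private DNS points the name at
    the endpoint network interfaces."""
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )


if __name__ == "__main__":
    # Run this on an instance or function inside the VPC.
    addrs = resolve("ec2.us-east-1.amazonaws.com")
    print(addrs, "private:", all_private(addrs))
```

If the addresses are public, requests to that hostname are not going through an Interface Endpoint with private DNS. This check says nothing about Gateway Endpoints, which leave DNS unchanged and work at the routing layer.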
Conclusion
So in both cases, priority is given to the endpoints, not to the NAT gateway or the internet route. I recommend checking the links provided. They have more info, with examples, to double-check my conclusions.
VPC Endpoints or NAT Gateway?
AWS services like EC2, RDS, Lambda, and ElastiCache come with an Elastic Network Interface (ENI), which enables communication from within your VPCs via Private Endpoints. However, many AWS services provide a REST API, available via the Internet only. A few examples: S3, DynamoDB, CloudWatch, SQS, and Kinesis.
There are three options to make these services accessible from private subnets:
VPC Endpoint type Gateway Endpoint: free of charge, but only available for S3 and DynamoDB.
VPC Endpoint type Interface Endpoint: costs $7.20 per month and AZ plus $0.01 per GB, and is available for most AWS services.
NAT Gateway: can be used to access AWS services or any other services with a public API. Costs are $32.40 per month and AZ plus $0.045 per GB.
Keep the following rules of thumb in mind when designing your network architecture.
Adding Gateway Endpoints for S3 and DynamoDB should be your default option.
Do you need to access non-AWS resources via the Internet? Add a NAT Gateway. Do the math to see whether traffic to AWS services justifies additional Interface Endpoints.
Are you only accessing AWS services from the private subnets? No more than four different services? Use Interface Endpoints. Otherwise, do the math to calculate costs for Interface Endpoints and NAT Gateway.
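"Doing the math" can be sketched as below, using the prices quoted above. These figures come from the text, may be out of date, and vary by region, and the sketch assumes one Interface Endpoint per service in each AZ. The function names are made up for this example.

```python
# Prices as quoted in the text above (assumptions, check current pricing).
INTERFACE_ENDPOINT_PER_MONTH_AZ = 7.20
INTERFACE_ENDPOINT_PER_GB = 0.01
NAT_GATEWAY_PER_MONTH_AZ = 32.40
NAT_GATEWAY_PER_GB = 0.045


def interface_endpoints_cost(services: int, azs: int, gb: float) -> float:
    """Monthly cost of one Interface Endpoint per service in each AZ."""
    fixed = services * azs * INTERFACE_ENDPOINT_PER_MONTH_AZ
    return round(fixed + gb * INTERFACE_ENDPOINT_PER_GB, 2)


def nat_gateway_cost(azs: int, gb: float) -> float:
    """Monthly cost of one NAT Gateway in each AZ."""
    return round(azs * NAT_GATEWAY_PER_MONTH_AZ + gb * NAT_GATEWAY_PER_GB, 2)


if __name__ == "__main__":
    for services in (4, 5):
        print(services, "services, 1 AZ, 100 GB:",
              interface_endpoints_cost(services, 1, 100),
              "vs NAT", nat_gateway_cost(1, 100))
```

With no traffic, four Interface Endpoints in one AZ cost less than one NAT Gateway and five cost more, which is where the "no more than four services" rule of thumb comes from. Higher data volumes shift the balance towards Interface Endpoints because of the lower per-GB price.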
Ref Link: https://cloudonaut.io/advanved-aws-networking-pitfalls-that-you-should-avoid/
I want to implement a scenario where an AWS VPC has connectivity to both Direct Connect and an IGW, so that I can access some applications from the data center and access aws resources from internet.
Yes. You can provision an Internet Gateway on the VPC to provide direct access to the Internet, and you can also provision a Virtual Private Gateway that connects to Direct Connect for connectivity to your on-premises or data center networks.
The Route Table configuration will determine which traffic is routed to each gateway.
If your only need for using an Internet Gateway is to "access aws resources from internet" (rather than making your services public), then you should put your resources in private subnets and place a NAT Gateway in a public subnet. This way, your AWS resources will have outbound Internet connectivity via the NAT Gateway without being exposed to the Internet.
I am planning to use AWS File Gateway in a hybrid environment where I will mount the File Gateway to an EC2 instance from within a private subnet. As per AWS documentation, all data transfer is done through HTTPS when using File Gateway.
But since my File Gateway, EC2 instance and S3 are all inside the AWS environment, will my File Gateway still transfer files over the internet to S3 service endpoint (s3.amazonaws.com) or will it leverage VPC endpoint for S3?
Note: I cannot use EFS for this purpose as it's not HIPAA compliant.
A VPC Endpoint for S3 uses a predefined IP prefix list in your subnet route tables, which hijacks all of the traffic bound for all of the IP addresses assigned to S3 in your region... so from a subnet associated with an S3 VPC endpoint, all traffic bound for any S3 address in the region is routed through the endpoint.
To state it another way, when correctly configured, an S3 VPC endpoint becomes the only way S3 can be accessed from the associated subnets, and because it's done at the IP routing layer, anything accessing S3 from those subnets will automatically and transparently use the endpoint.
The prefix list ID logically represents the range of public IP addresses used by the service. All instances in subnets associated with the specified route tables automatically use the endpoint to access the service; subnets that are not associated with the specified route tables do not use the endpoint.
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints.html
In theory, if you configure your VPC Route Table to use the VPC Endpoint, then any traffic destined for S3 will be sent via the VPC Endpoint. (By the way, it might only work when connecting to S3 in the same region.)
Regardless, even if the traffic is routed through your Internet Gateway to the Amazon S3 endpoint, the traffic will not traverse the real "Internet" -- it will simply pass through the AWS edge of the Internet, never leaving the AWS data center (as long as it is in the same Region).