What is the purpose of Enhanced VPC routing for Redshift ?
I've read the doc https://docs.aws.amazon.com/redshift/latest/mgmt/enhanced-vpc-routing.html
but it it not clear to me.
When you create a redshift cluster, the leader node resides in a VPC / subnet.
Hence it will always use VPC routing, Security groups etc to route requests right ?
How come that redshift wouldn't use VPC traffic when performing "COPY" commands ?
Enhanced VPC routing forces the traffic to go through your VPC.
With it disabled, even if your cluster is in a VPC, it will route traffic via the public Internet instead of going through the VPC.
This is because it uses an "internal" network interface that's outside of the VPC, regardless of whether or not the cluster itself is in a VPC.
Here's a relevant excerpt from the docs:
In Amazon Redshift, network traffic created by COPY, UNLOAD, and Amazon Redshift Spectrum flow through a network interface. This network interface is internal to the Amazon Redshift cluster, and is located outside of your Amazon Virtual Private Cloud (Amazon VPC). By default, the network traffic is then routed through the public internet to reach its destination.
However, when you enable Amazon Redshift enhanced VPC routing, Amazon Redshift routes the network traffic through a VPC instead. Amazon Redshift enhanced VPC routing uses an available routing option, prioritizing the most specific route for network traffic. The VPC endpoint is prioritized as the first route priority. If a VPC endpoint is unavailable, Amazon Redshift routes the network traffic through an internet gateway, NAT instance, or NAT gateway.
Related
I have VPC with couple of subnets containing EC2 instances.
The EC2 instances have code that invokes various AWS services like dybamodb.
Is the connection from EC2 to AWS Service (like dynamodb) happening within the AWS Network, or via public internet?
Is there any way to control this?
Is the connection from EC2 to AWS Service (like dynamodb) happening within the AWS Network, or via public internet?
Technically the process on EC2 would be hitting the AWS DynamoDB public API which is on the Internet. The traffic would be routed through the Internet Gateway you have attached to the VPC. I think if it is all in the same region it may not actually leave the AWS data center, and you could try testing that via tools like traceroute, but I don't think there are any guarantees of that.
Is there any way to control this?
Yes, add a VPC Endpoint to your VPC for the service you want to connect to. Then the DNS server in your VPC will route all traffic to that service over the VPC Endpoint, instead of routing it to your VPC's Internet Gateway. The traffic will then be guaranteed to stay within the AWS network.
I am trying to connect to services and databases running inside a VPC (private subnets) from an AWS Glue job. The private resources should not be exposed publicly (e.g., moving to a public subnet or setting up public load balancers).
Unfortunately, AWS Glue doesn't seem to support running inside user defined VPCs. AWS does provide something called Glue Database Connections which, when used with the Glue SDK, magically set up elastic network interfaces inside the specified VPC for Glue/Spark worker nodes. The network interfaces then tunnel traffic from Glue to a specific database inside the VPC. However, this requires the location and credentials of specific databases, and it is not clear if and when other traffic (e.g., a REST call to a service) is tunnelled through the VPC.
Is there a reliable way to setup a Glue -> VPC connection that will tunnel all traffic through a VPC?
You can create a database connection with NETWORK connection type and use that connection in your Glue job. It will allow your job to call a REST API or any other resource within your VPC.
https://docs.aws.amazon.com/glue/latest/dg/connection-using.html
Network (designates a connection to a data source within an Amazon
Virtual Private Cloud environment (Amazon VPC))
https://docs.aws.amazon.com/glue/latest/dg/connection-JDBC-VPC.html
To allow AWS Glue to communicate with its components, specify a
security group with a self-referencing inbound rule for all TCP ports.
By creating a self-referencing rule, you can restrict the source to
the same security group in the VPC and not open it to all networks.
However, this requires the location and credentials of specific
databases, and it is not clear if and when other traffic (e.g., a REST
call to a service) is tunnelled through the VPC.
I agree the documentation is confusing, but according to this paragraph on the page you linked, it appears that all traffic is indeed tunneled through the VPC, since you have to have a NAT Gateway or VPC endpoints to allow Glue to access things outside the VPC once you have configured it with VPC access:
All JDBC data stores that are accessed by the job must be available
from the VPC subnet. To access Amazon S3 from within your VPC, a VPC
endpoint is required. If your job needs to access both VPC resources
and the public internet, the VPC needs to have a Network Address
Translation (NAT) gateway inside the VPC.
Question: What are the downsides (if any) to enabling Enhanced VPC Routing on an Amazon Redshift cluster?
According to the documentation, there is no extra charge and traffic is prevented from traveling over the public internet. Why wouldn't this be a default option always enabled?
https://docs.aws.amazon.com/redshift/latest/mgmt/enhanced-vpc-routing.html
If Enhanced VPC Routing is not enabled, Amazon Redshift routes traffic through the internet, including traffic to other services within the AWS network.
There is no additional charge for using Enhanced VPC Routing. You might incur additional data transfer charges for certain operations. These include such operations as UNLOAD to Amazon S3 in a different AWS Region. COPY from Amazon EMR, or Secure Shell (SSH) with public IP addresses.
Background: We have a Redshift cluster that intermittently drops ODBC connections with a TCP reset, but only when enhanced VPC routing is enabled.
The Query Editor in the Redshift console does not support clusters with Enhanced VPC Routing enabled. That is the only downside that I know of.
I have a private M2M GSM network for my company devices.
I want to send traffic from my devices to AWS IOT but the M2M provider doesn't allow internet access from its sim cards, it only provide an IPSec connexion to a a private network.
I had now problem configuring the IPSec connexion to an AWS VPC and my sims can successfully ping all instance in my AWS VPC. However what I want is for my sims to access AWS IOT.
What I did:
I configured my VPN with AWS third scenario. I have a public network with CIDR 192.168.0.0/24 and a private network with CIDR 192.168.1.0/24. My VPN has a static route CIDR 10.1.128.0/14 for my M2M network.
Then I launched an EC2 Nat Instance inside my public network.
I added a routing rule to my VPC main routing table to route trafic to 0.0.0.0/0 to my NAT instance.
I launched an EC2 instance in my VPC's private network and try to access internet from it, this work and I can see trafic going throung my nat instance. So I assume my nat and routing is well configured.
However I still can't manage to access internet from my sim cards, traffic isn't even routed to my NAT instance. According to John Rotenstein's answer VPN traffic will not use my routing rule.
Does AWS VPN drop traffic which is not destinated to the VPC's or VPN's CIDR ? Is there a security reason for that ?
If that's the case is there a way to customize routing rules for the VPN's traffic ? Or is the only solution to use a custom VPN within an EC2 instance ?
Thank you for your help.
I added a routing rule to my VPC main routing table to route trafic to 0.0.0.0/0 to my NAT instance.
It is an understandable misconception that the "main" route table of a VPC impacts traffic coming in from a VPC hardware VPN. It doesn't. There is no route table that applies to such traffic, only the implicit target of the VPC subnets. Only the assigned CIDR blocks can be reached from such a VPN.
Does AWS VPN drop traffic which is not destinated to the VPC's or VPN's CIDR? Is there a security reason for that?
Yes, that traffic is dropped.
It probably not specifically for security reasons... it's just the way the service was designed to work. Managed VPN connections are intended for access to instance-based services, and don't support traffic flows we might generally categorize as gateway, edge-to-edge, peering, or transit.
If you can configure your edge devices to use a web proxy, then a forward proxy server like squid could handle the connectivity for the devices, because the IP path between a device and a forward proxy is a connection involving only the device and proxy IPs.
A simpler solution would be to use an instance-based firewall to terminate the VPN, instead of the built-in VPC VPN service, because then the firewall instance could allow the traffic to hairpin through itself, source-masquerading (NAT) the traffic behind its own EIP, and this would be something the VPC infrastructure easily supports.
An instance-based firewall is something you can build yourself, of course, but there are also several products in the AWS Marketplace that provide IPSec tunnel termination and NAT capability. Some have free trial periods where the only cost is the cost of the instance.
Is it save to transfer unsecured http messages between two ec2 instances within the same vpc in aws?
Or is it necessary to use ssh tunneling etc?
It's safe in the sense that only your instances exist in the VPC. So the traffic between your two instances in your VPC cannot be sniffed by a 3rd party.
Amazon Virtual Private Cloud (Amazon VPC) lets you provision a
logically isolated section of the Amazon Web Services (AWS) Cloud
where you can launch AWS resources in a virtual network that you
define. You have complete control over your virtual networking
environment, including selection of your own IP address range,
creation of subnets, and configuration of route tables and network
gateways.
Source: Amazon VPC